Abstract

SMT-based checking of refinement types for call-by-value lan- 
guages is a well-studied subject. Unfortunately, the classical trans- 
lation of refinement types to verification conditions is unsound un- 
der lazy evaluation. When checking an expression, such systems 
implicitly assume that all the free variables in the expression are 
bound to values. This property is trivially guaranteed by eager, but 
does not hold under lazy, evaluation. Thus, to be sound and precise, 
a refinement type system for Haskell and the corresponding verifi- 
cation conditions must take into account which subset of binders 
actually reduces to values. We present a stratified type system that 
labels binders as potentially diverging or not, and that (circularly) 
uses refinement types to verify the labeling. We have implemented 
our system in LIQUIDHASKELL and present an experimental evaluation
of our approach on more than 10,000 lines of widely used
Haskell libraries. We show that LIQUIDHASKELL is able to prove
96% of all recursive functions terminating, while requiring a mod- 
est 1.7 lines of termination-annotations per 100 lines of code. 

1. Introduction 

Refinement types encode invariants by composing types with SMT-decidable
refinement predicates [27, 37], generalizing Floyd-Hoare
Logic (e.g. EscJava [14]) for functional languages. For example

type Pos = {v:Int | v > 0}
type Nat = {v:Int | v >= 0}

are the basic type Int refined with logical predicates that state
that "the values" v described by the type are respectively strictly
positive and non-negative. We encode pre- and post-conditions
(contracts) using refined function types like

div :: n:Nat -> d:Pos -> {v:Nat | v <= n}

which states that the function div requires inputs that are respectively
non-negative and positive, and ensures that the output is no greater
than the first input n. If a program containing div statically type-checks,
we can rest assured that executing the program will not
lead to any unpleasant divide-by-zero errors. By combining types 
and SMT based validity checking, refinement types have auto- 
mated the verification of programs with recursive datatypes, higher- 
order functions, and polymorphism. Several groups have used re- 
finements to statically verify properties ranging from simple array 
safety [26, 37] to functional correctness of data structures [20], security
protocols [4], and compiler correctness [31].

Given the remarkable effectiveness of the technique, we em- 
barked on the project of developing a refinement type based veri- 
fier for Haskell. The previous systems were all developed for eager, 
call-by-value languages, but we presumed that the order of evalua- 
tion would surely prove irrelevant, and that the soundness guaran- 
tees would translate to Haskell's lazy, call-by-need regime. 

We were wrong. Our first contribution is to show that standard 
refinement systems crucially rely on a property of eager languages: 
when analyzing any term, one can assume that all the free vari- 
ables appearing in the term are bound to values. This property lets 
us check each term in an environment where the free variables are 
logically constrained according to their refinements. Unfortunately, 
this property does not hold for lazy evaluation, where free variables 
can be lazily substituted with arbitrary (potentially diverging) expressions,
which breaks soundness (§2).

The two natural paths towards soundness are blocked by chal- 
lenging problems. The first path is to conservatively ignore free 
variables except those that are guaranteed to be values e.g. by pat- 
tern matching, seq or strictness annotations. While sound, this 
leads to a drastic loss of precision. The second path is to explicitly 
reason about divergence within the refinement logic. This would be 
sound and precise; however, it is far from obvious to us how to
re-use and extend existing SMT machinery for this purpose (§8).

Our second contribution is a novel approach that enables sound 
and precise checking with existing SMT solvers, using a stratified 
type system that labels binders as potentially diverging or not (§3).
While previous stratified systems [10] would suffice for soundness,
we show how to recover precision by using refinement types to 
develop a notion of terminating fixpoint combinators that allows the 
type system to automatically verify that a wide variety of recursive 
functions actually terminate (§5).

Our third contribution is an extensive empirical evaluation of 
our approach on more than 10,000 lines of widely used complex
Haskell libraries. We have implemented our approach in LIQUIDHASKELL,
an SMT-based verifier for Haskell. LIQUIDHASKELL is
able to prove 96% of all recursive functions terminating, requiring
a modest 1.7 lines of termination annotations per 100 lines of code,
thereby enabling the sound, precise, and automated verification of
functional correctness properties of real-world Haskell code (§6).

2. Overview 

We start with an overview of our contributions. After recapitulat- 
ing the basics of refinement types we illustrate why the classical 
approach based on verification conditions (VCs) is unsound due to 
lazy evaluation. Next, we step back to understand precisely how 
the VCs arise from refinement subtyping, and how subtyping is 
different under eager and lazy evaluation. In particular, we demon- 
strate that under lazy, but not eager, evaluation, the refinement type 
system, and hence the VCs, must account for divergence. Conse- 
quently, we develop a type system that accounts for divergence in 
a modular and syntactic fashion, and illustrate its use via several
small examples. Finally, we show how a refinement-based termi- 
nation analysis can be used to improve precision, yielding a highly 
effective SMT-based verifier for Haskell. 

2.1 Standard Refinement Types: From Subtyping to VC 

First, let us see how standard refinement type systems [21, 26] will
use the refinement type aliases Pos and Nat and the specification
for div from §1 to accept good and reject bad. We use the syntax
of Figure 1, where r is a refinement expression, or just refinement
for short. We will vary the expressiveness of the language of refinements
in different parts of the paper.



good     :: Nat -> Nat -> Int
good x y = let z = y + 1 in x `div` z

bad      :: Nat -> Nat -> Int
bad x y  = x `div` y



Refinement Subtyping To analyze the body of bad, the refinement 
type system will check that the second parameter y has type Pos at 
the call to div; formally, that the actual parameter y is a subtype of 
the type of div's second input, via a subtyping query: 



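\[ x:\{x:\mathtt{Int} \mid x \geq 0\},\ y:\{y:\mathtt{Int} \mid y \geq 0\} \vdash \{v:\mathtt{Int} \mid v \geq 0\} \preceq \{v:\mathtt{Int} \mid v > 0\} \]

We use the Abbreviations of Figure 1 to simplify the syntax of the
queries. So the above query simplifies to:

\[ x:\{x \geq 0\},\ y:\{y \geq 0\} \vdash \{v \geq 0\} \preceq \{v > 0\} \]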

Verification Conditions To discharge the above subtyping query, 
a refinement type system generates a verification condition (VC), 
a logical formula that stipulates that under the assumptions cor- 
responding to the environment bindings, the refinement in the sub- 
type implies the refinement in the super-type. We use the translation 
$(\!|\cdot|\!)$ shown in Figure 1 to reduce a subtyping query to a verification
condition. The translation of a basic type into logic is the refine- 
ment of the type. The translation of an environment is the conjunc- 
tion of its bindings. Finally, the translation of a binding x:t is the 
embedding of r guarded by a predicate denoting that "x is a value". 
For now, let us ignore this guard and see how the subtyping query 
for bad reduces to the classical VC: 

\[ (x \geq 0) \wedge (y \geq 0) \Rightarrow (v \geq 0) \Rightarrow (v > 0) \]

Refinement type systems are carefully engineered so that (un- 
like with full dependent types) the logic of refinements precludes 
arbitrary functions and only includes formulas from efficiently de- 
cidable logics, e.g. the quantifier-free logic of linear arithmetic and 
uninterpreted functions (QF-EUFLIA). Thus, VCs like the above 
can be efficiently validated by SMT solvers [11]. In this case, the
solver will reject the above VC as invalid, meaning that the implication,
and hence the relevant subtyping requirement, does not hold. So the
refinement type system will reject bad. 

On the other hand, a refinement system accepts good. Here, +'s 
type exactly captures its behaviour into the logic: 



(+) :: x:Int -> y:Int -> {v:Int | v = x + y}



Thus, we can conclude that the divisor z is a positive number. The 
subtyping query for the argument to div is 

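\[ x:\{x \geq 0\},\ y:\{y \geq 0\},\ z:\{z = y + 1\} \vdash \{v = y + 1\} \preceq \{v > 0\} \]

which reduces to the valid VC

\[ (x \geq 0) \wedge (y \geq 0) \wedge (z = y + 1) \Rightarrow (v = y + 1) \Rightarrow (v > 0) \]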
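The two verification conditions for bad and good are small enough to sanity-check by brute force. The sketch below is not part of any refinement checker: the names implies, vcBad, vcGood, and searchRange are hypothetical, and searching a small finite range only illustrates the verdict an SMT solver would reach over all integers; it proves nothing.

```haskell
-- Illustrative only: encode each VC as a Boolean function and search a
-- small range of integers for assignments that falsify it.

-- Logical implication.
implies :: Bool -> Bool -> Bool
implies p q = not p || q

-- VC for bad:  (x >= 0) /\ (y >= 0) => (v >= 0) => (v > 0)
vcBad :: Int -> Int -> Int -> Bool
vcBad x y v = (x >= 0 && y >= 0) `implies` ((v >= 0) `implies` (v > 0))

-- VC for good: (x >= 0) /\ (y >= 0) /\ (z = y + 1) => (v = y + 1) => (v > 0)
vcGood :: Int -> Int -> Int -> Int -> Bool
vcGood x y z v =
  (x >= 0 && y >= 0 && z == y + 1) `implies` ((v == y + 1) `implies` (v > 0))

-- Hypothetical search range; an SMT solver reasons over all integers.
searchRange :: [Int]
searchRange = [-3 .. 3]

-- Assignments in the range that falsify the VC for bad.
badCounterexamples :: [(Int, Int, Int)]
badCounterexamples =
  [ (x, y, v)
  | x <- searchRange, y <- searchRange, v <- searchRange
  , not (vcBad x y v) ]

-- Assignments in the range that falsify the VC for good.
goodCounterexamples :: [(Int, Int, Int, Int)]
goodCounterexamples =
  [ (x, y, z, v)
  | x <- searchRange, y <- searchRange, z <- searchRange, v <- searchRange
  , not (vcGood x y z v) ]
```

Every falsifying assignment for vcBad has v = 0, which is exactly the divisor the type of div rules out; the search finds no falsifying assignment for vcGood in this range.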



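\[
\begin{array}{llcl}
\textbf{Refinements} & r & ::= & \ldots\ \text{varies}\ \ldots \\
\textbf{Basic Types} & b & ::= & \{v:\mathtt{Int} \mid r\} \\
\textbf{Types} & \tau & ::= & b \;\mid\; x:\tau \rightarrow \tau \\
\textbf{Environment} & \Gamma & ::= & \emptyset \;\mid\; x:\tau, \Gamma \\
\textbf{Subtyping} & & & \Gamma \vdash \tau_1 \preceq \tau_2 \\[4pt]
\textbf{Abbreviations} & x:\{r\} & \doteq & x:\{x:\mathtt{Int} \mid r\} \\
 & \{x \mid r\} & \doteq & \{x:\mathtt{Int} \mid r\} \\
 & \{r\} & \doteq & \{v:\mathtt{Int} \mid r\} \\
 & \{x:\{y:\mathtt{Int} \mid r_y\} \mid r_x\} & \doteq & \{x:\mathtt{Int} \mid r_x \wedge r_y[x/y]\} \\[4pt]
\textbf{Translation} & (\!|\Gamma \vdash b_1 \preceq b_2|\!) & \doteq & (\!|\Gamma|\!) \Rightarrow (\!|b_1|\!) \Rightarrow (\!|b_2|\!) \\
 & (\!|\{x:\mathtt{Int} \mid r\}|\!) & \doteq & r \\
 & (\!|x:\{v:\mathtt{Int} \mid r\}|\!) & \doteq & \text{``$x$ is a value''} \Rightarrow r[x/v] \\
 & (\!|x:(y:\tau_y \rightarrow \tau)|\!) & \doteq & \mathtt{true} \\
 & (\!|x_1:\tau_1, \ldots, x_n:\tau_n|\!) & \doteq & (\!|x_1:\tau_1|\!) \wedge \ldots \wedge (\!|x_n:\tau_n|\!)
\end{array}
\]

Figure 1. Notation: Types, Subtyping & VCs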



2.2 Lazy Evaluation Makes VCs Unsound 

To generate the classical VC, we ignored the "x is a value" guard
that appears in the embedding of a binding $(\!|x:\tau|\!)$ (Figure 1).
Under lazy evaluation, ignoring this "is a value" guard can lead 
to unsoundness. Consider 



diverge   :: Int -> {v:Int | false}
diverge n = diverge n



The output type captures the post-condition that the function returns
an Int satisfying false. This counter-intuitive specification
states, in essence, that the function does not terminate, i.e.
does not return any value. Any standard refinement type checker
(or Floyd-Hoare verifier like Dafny¹) will verify the given signature
for diverge via the classical method of inductively assuming
the signature holds for diverge and then guaranteeing the signature
[16, 23]. Next, consider the call to div in explode:

explode   :: Int -> Int
explode x = let {n = diverge 1; y = 0}
            in x `div` y

To analyze explode, the refinement type system will check that y 
has type Pos at the call to div, i.e. will check that 

\[ n:\{\mathtt{false}\},\ y:\{y = 0\} \vdash \{v = 0\} \preceq \{v > 0\} \tag{1} \]

In the subtyping environment n is bound to the type corresponding 
to the output type of diverge, and y is bound to the singleton 
type stating y equals 0. In this environment, we must prove that
the actual parameter's type, i.e. that of y, is a subtype of Pos. The
subtyping, using the embedding of Figure 1 and ignoring the "is a
value" guard, reduces to the VC:

\[ \mathtt{false} \wedge y = 0 \Rightarrow (v = 0) \Rightarrow (v > 0) \tag{2} \]

The SMT solver proves this VC valid by using the contradiction in 
the antecedent, thereby unsoundly proving the call to div safe! 

Eager vs. Lazy Verification Conditions At this point, we pause
to emphasize that the problem lies in the fact that the classical
technique for encoding subtyping (or generally, Hoare's "rule of
consequence" [16]) with VCs is unsound under lazy evaluation.
To see this, observe that the VC (2) is perfectly sound under eager
(strict, call-by-value) evaluation. In the eager setting, the program is
safe in that div is never called with the divisor 0, as it is not called
at all! The inconsistent antecedent in the VC logically encodes the
fact that, under eager evaluation, the call to div is dead code. Of
course, this conclusion is spurious under Haskell's lazy semantics.
As n is not required, the program will dive headlong into evaluating
the div and hence crash, rendering the VC meaningless.

1 http://rise4fun.com/Dafny/wVGc
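The crash is easy to observe in plain Haskell. The sketch below drops the refinement annotations and wraps the call in an exception handler; explodeOutcome is a hypothetical helper added only to make the behaviour inspectable, and the argument 10 is an arbitrary choice.

```haskell
import Control.Exception (ArithException (..), evaluate, try)

-- Never returns, so its refined output type {v:Int | false} holds vacuously.
diverge :: Int -> Int
diverge n = diverge n

-- Plain-Haskell explode: n is bound but never demanded, so lazy evaluation
-- skips the diverging computation and goes straight to the division.
explode :: Int -> Int
explode x = let { n = diverge 1; y = 0 } in x `div` y

-- Force the result to weak head normal form and capture any arithmetic
-- exception raised on the way.
explodeOutcome :: IO (Either ArithException Int)
explodeOutcome = try (evaluate (explode 10))
```

Running explodeOutcome yields Left DivideByZero rather than looping, which is the behaviour the classical VC (2) fails to anticipate.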

The Problem is Laziness Readers familiar with fully dependently
typed languages like Cayenne [1], Agda [24], Coq [5], or Idris [7],
may be tempted to attribute the unsoundness to the presence of arbitrary
recursion and hence non-termination (e.g. diverge). While
it is possible to define a sound semantics for dependent types that
mention potentially non-terminating expressions [21], it is not clear
how to reconcile such semantics with decidable type checking.

Refinement type systems avoid this situation by carefully restricting
types so that they do not contain arbitrary terms (even
through substitution), but rather only terms from restricted logics
that preclude arbitrary user-defined functions [13, 31, 37]. Very
much like previous work, we enforce the same restriction with a
well-formedness condition on refinements (WF-BASE-D in Fig. 5).

However, we show that this restriction is plainly not sufficient 
for soundness when laziness is combined with non-termination, as 
binders can be bound to diverging expressions. Unsurprisingly, in a 
strongly normalizing language the question of lazy or strict seman- 
tics is irrelevant for soundness, and hence an "easy" way to solve 
the problem would be to completely eliminate non-termination and 
rely on the soundness of previous refinement or dependent type sys- 
tems! Instead, we show here how to recover soundness for a lazy 
language without imposing such a drastic requirement. 

2.3 Semantics, Subtyping & Verification Conditions 

To understand the problem, let us take a step back to get a clear 
view of the relationship between the operational semantics, sub- 
typing, and verification conditions. We use the formulation of 
evaluation-order independent refinement subtyping developed for 
$\lambda_H$ [21] in which refinements r are arbitrary expressions e from
the source language. We define a denotation for types and use it to 
define subtyping declaratively. 

Denotations of Types and Environments Recall the type Pos defined
as {v:Int | 0 < v}. Intuitively, Pos denotes the set of Int
expressions which evaluate to values greater than 0. We formalize
this intuition by defining the denotation of a type as:

\[ [\![\{x:\tau \mid r\}]\!] \doteq \{ e \mid \emptyset \vdash e : \tau,\ \text{if } e \hookrightarrow^{*} w \text{ then } r[w/x] \hookrightarrow^{*} \mathtt{true} \} \]

That is, the type denotes the set of expressions e that have the
corresponding base type $\tau$ and that either diverge or reduce to values that
make the refinement true. The guard $e \hookrightarrow^{*} w$ is crucially required
to prove soundness in the presence of recursion. Thus, quoting [21],
"refinement types specify partial and not total correctness".

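An environment $\Gamma$ is a sequence of type bindings, and a closing
substitution $\theta$ is a sequence of expression bindings:

\[ \Gamma \doteq x_1:\tau_1, \ldots, x_n:\tau_n \qquad \theta \doteq x_1 \mapsto e_1, \ldots, x_n \mapsto e_n \]

Thus, we define the denotation of $\Gamma$ as the set of substitutions:

\[ [\![\Gamma]\!] \doteq \{ \theta \mid \forall x:\tau \in \Gamma.\ \theta(x) \in [\![\theta(\tau)]\!] \} \]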

Declarative Subtyping Equipped with interpretations for types and
environments, we define the declarative subtyping $\preceq$-BASE (over
basic types b, shown in Figure 1) to be containment between the
types' denotations:

\[ \frac{\forall \theta \in [\![\Gamma]\!].\ [\![\theta(\{v:B \mid r_1\})]\!] \subseteq [\![\theta(\{v:B \mid r_2\})]\!]}{\Gamma \vdash \{v:B \mid r_1\} \preceq \{v:B \mid r_2\}}\ \preceq\text{-BASE} \]

Let us revisit the explode example from §2.2; recall that the function
is safe under eager evaluation but unsafe under lazy evaluation.
Let us see how the declarative subtyping allows us to reject in the
one case and accept in the other.

Declarative Subtyping with Lazy Evaluation Let us revisit the
query (1) to see whether it holds under the declarative subtyping
rule $\preceq$-BASE. The denotation containment

\[ \forall \theta \in [\![n:\{\mathtt{false}\},\ y:\{y = 0\}]\!].\ [\![\theta\,\{v = 0\}]\!] \subseteq [\![\theta\,\{v > 0\}]\!] \tag{3} \]

does not hold. To see why, consider a $\theta$ that maps n to any diverging
expression of type Int and y to the value 0. Then, $0 \in [\![\theta\,\{v = 0\}]\!]$
but $0 \notin [\![\theta\,\{v > 0\}]\!]$, thereby showing that the denotation containment
does not hold.

Declarative Subtyping with Eager Evaluation Since denotational
containment (3) does not hold, $\lambda_H$ cannot verify explode under
eager evaluation. However, Belo et al. [3] note that under eager
(call-by-value) evaluation, each binder in the environment is only
added after the previous binders have been reduced to values.
Hence, under eager evaluation we can restrict the range of the
closing substitutions to values (as opposed to expressions). Let us
reconsider (3) in this new light: there is no value that we can map
n to, so the set of denotations of the environment is empty. Hence,
the containment (3) vacuously holds under eager evaluation, which
proves the program safe. Belo's observation is implicitly used by
refinement types for eager languages to prove that the standard (i.e.
under call-by-value) reduction from subtyping to VC is sound.

Algorithmic Subtyping via Verification Conditions The above
subtyping ($\preceq$-BASE) rule allows us to prove preservation and
progress [21] but quantifies over evaluation of arbitrary expressions,
and so is undecidable. To make checking algorithmic we
approximate the denotational containment using verification con- 
ditions (VCs), formulas drawn from a decidable logic, that are valid 
only if the undecidable containment holds. As we have seen, the 
classical VC is sound only under eager evaluation. Next, let us use 
the distinctions between lazy and eager declarative subtyping, to 
obtain both sound and decidable VCs for the lazy setting. 

Step 1: Restricting Refinements To Decidable Logics Given that in 
$\lambda_H$ refinements can be arbitrary expressions, the first step towards
obtaining a VC, regardless of evaluation order, is to restrict the 
refinements to a decidable logic. We choose the quantifier free 
logic of equality, uninterpreted functions and linear arithmetic (QF- 
EUFLIA). We design our typing rules to ensure that for any valid 
derivation, all the refinements belong in this restricted language. 

Step 2: Translating Containment into VCs Our goal is to encode
the denotation containment antecedent of $\preceq$-BASE

\[ \forall \theta \in [\![\Gamma]\!].\ [\![\theta(\{v:B \mid r_1\})]\!] \subseteq [\![\theta(\{v:B \mid r_2\})]\!] \tag{4} \]

as a logical formula that is valid only when the above holds. Intuitively,
we can think of the closing substitutions $\theta$ as corresponding
to assignments $(\!|\theta|\!)$ of variables X of the VC. We use the variable
x to approximate denotational containment by stating that if x belongs
to the type $\{v:B \mid r_1\}$ then x belongs to the type $\{v:B \mid r_2\}$:

\[ \forall X \in dom(\Gamma), x.\ (\!|\Gamma|\!) \Rightarrow (\!|x:\{v:B \mid r_1\}|\!) \Rightarrow (\!|x:\{v:B \mid r_2\}|\!) \]

where $(\!|\Gamma|\!)$ and $(\!|x:\tau|\!)$ are respectively the translation of the environment
and bindings into logical formulas that are only satisfied
by assignments $(\!|\theta|\!)$, as shown in Figure 1. Using the translation of
bindings, and by renaming x to v, we rewrite the condition as

\[ \forall X \in dom(\Gamma), v.\ (\!|\Gamma|\!) \Rightarrow (\text{``$v$ is a value''} \Rightarrow r_1) \Rightarrow (\text{``$v$ is a value''} \Rightarrow r_2) \]

Type refinements are carefully chosen to belong to the decidable
logical sublanguage QF-EUFLIA, thus we directly translate type
refinements into the logic. Thus, what is left is to translate into
logic the environment and the "is a value" guards. We postpone
translation of the guards as we approximate the above formula by a
stronger (i.e. sound with respect to (4)) VC that just omits the guards:

\[ \forall X \in dom(\Gamma), v.\ (\!|\Gamma|\!) \Rightarrow r_1 \Rightarrow r_2 \]

To translate environments, we conjoin their bindings' translations:

\[ (\!|x_1:\tau_1, \ldots, x_n:\tau_n|\!) \doteq (\!|x_1:\tau_1|\!) \wedge \ldots \wedge (\!|x_n:\tau_n|\!) \]

However, since types denote partial correctness, the translations
must also explicitly account for possible divergence:

\[ (\!|x:\{v:\mathtt{Int} \mid r\}|\!) \doteq \text{``$x$ is a value''} \Rightarrow r[x/v] \]

That is, we cannot assume that each x satisfies its refinement r; we 
must guard that assumption with a predicate stating that x is bound 
to a value (not a diverging term.) 

The crucial question is: how can one discharge these guards to 
conclude that x indeed satisfies r? One natural route is to enrich the 
refinement logic with a predicate that states that "x is a value", and 
then use the SMT solver to explicitly reason about this predicate 
and hence, divergence. Unfortunately, we show in §8 that such
predicates lead to three-valued logics, which fall outside the scope 
of the efficiently decidable theories supported by current solvers. 
Hence, this route is problematic if we want to use existing SMT 
machinery to build automated verifiers for Haskell. 

2.4 Our Answer: Implicit Reasoning About Divergence 

One way forward is to implicitly reason about divergence by eliminating
the "x is a value" guards (i.e. value guards) from the VCs.

Implicit Reasoning: Eager Evaluation Under eager evaluation
the domain of the closing substitutions can be restricted to values [3].
Thus, we can trivially eliminate the value guards, as they
are guaranteed to hold by virtue of the evaluation order. Returning
to explode, we see that after eliminating the value guards, we get
the VC (2) which is, therefore, sound under eager evaluation.

Implicit Reasoning: Lazy Evaluation However, with lazy evaluation,
we cannot just eliminate the value guards, as the closing substitutions
are not restricted to just values. Our solution is to take
this reasoning out of the hands of the SMT logic and place it in
the hands of a stratified type system. We use a non-deterministic
$\beta$-reduction (formally defined in §3) to label each type as either
a Div-type, written $\tau$, the default type given to binders that
may diverge, or a Wnf-type, written $\tau^{\downarrow}$, given to binders
that are guaranteed to reduce, in a finite number of steps, to Haskell
values in Weak Head Normal Form (WHNF). Up to now we only
discussed Int basic types, but our theory supports user-defined
algebraic data types. An expression like 0 : repeat 0 is an infinite
Haskell value. As we shall discuss, such infinite values cannot be
represented in the logic. To distinguish infinite from finite values,
we use a Fin-type, written $\tau^{\Downarrow}$, to label binders of expressions that
are guaranteed to reduce to finite values with no redexes. This stratification
lets us generate VCs that are sound for lazy evaluation. Let
B be a basic labelled type. The key piece is the translation of environment
bindings:

\[ (\!|x:\{v:B \mid r\}|\!) \doteq \begin{cases} \mathtt{true}, & \text{if } B \text{ is a Div type} \\ r[x/v], & \text{otherwise} \end{cases} \]

That is, if the binder may diverge, we simply omit any constraints 
for it in the VC, and otherwise the translation directly states (i.e. 
without the value guard) that the refinement holds. Returning to 
explode, the subtyping query (1) yields the invalid VC

\[ \mathtt{true} \Rightarrow v = 0 \Rightarrow v > 0 \]

and so explode is soundly rejected under lazy evaluation. 
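The label-directed translation can be modelled in a few lines of Haskell. This is only a toy sketch under stated assumptions: Label, Binding, embed, validOn, and the other names are hypothetical, refinements are represented as Boolean functions over integer assignments, and an exhaustive search over a small range stands in for the SMT solver.

```haskell
-- A toy model of the environment translation; all names are illustrative.

data Label = Div | Wnf | Fin deriving (Eq, Show)

-- An assignment gives an integer to every logical variable.
type Assignment = [(String, Int)]
type Pred = Assignment -> Bool

-- A binding's label together with its refinement r[x/v].
data Binding = Binding Label Pred

-- Look up a variable; an unbound name is a modelling error.
val :: String -> Assignment -> Int
val x a = maybe (error ("unbound " ++ x)) id (lookup x a)

-- Translation of a binding: true for Div binders, the refinement otherwise.
embed :: Binding -> Pred
embed (Binding Div _) = const True
embed (Binding _   r) = r

-- Translation of an environment: the conjunction of its bindings.
embedEnv :: [Binding] -> Pred
embedEnv env a = all (\b -> embed b a) env

-- VC for the query  env |- {lhs} <: {rhs},  namely  env => lhs => rhs.
vc :: [Binding] -> Pred -> Pred -> Pred
vc env lhs rhs a = not (embedEnv env a) || not (lhs a) || rhs a

-- Brute-force validity over a small range (a stand-in for the SMT solver).
validOn :: [String] -> Pred -> Bool
validOn vars p = all (p . zip vars) (mapM (const [-2 .. 2]) vars)

-- Environment of explode: n:{false}, y:{y = 0}, under a chosen label.
explodeEnv :: Label -> [Binding]
explodeEnv l = [ Binding l (const False), Binding l (\a -> val "y" a == 0) ]

-- Query (1): {v = 0} <: {v > 0} under the explode environment.
explodeVC :: Label -> Pred
explodeVC l = vc (explodeEnv l) (\a -> val "v" a == 0) (\a -> val "v" a > 0)

lazyVerdict, eagerVerdict :: Bool
lazyVerdict  = validOn ["n", "y", "v"] (explodeVC Div)  -- binders may diverge
eagerVerdict = validOn ["n", "y", "v"] (explodeVC Fin)  -- binders are values
```

With Div labels the false refinement of n is dropped, so the assignment v = 0 falsifies the VC and lazyVerdict is False; with value labels the contradiction stays in the antecedent and eagerVerdict is True, matching the classical VC (2).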



As binders appear in refinements, and binders may refer to
potentially infinite computations (e.g. [0..]), we must ensure that
refinements are well defined (i.e. do not diverge). We achieve this
via stratification itself, i.e. by ensuring that all refinements have
type $\mathtt{Bool}^{\Downarrow}$. By Corollary 1, this suffices to ensure that all the
refinements are indeed well-defined and converge.

2.5 Verification With Stratified Types 

While it is reassuring that the lazy VC soundly rejects unsafe 
programs like explode, we now demonstrate by example that 
it usefully accepts safe programs. First, we show how the basic 
system - all terms have Div types - allows us to prove "partial 
correctness" properties without requiring termination. Second, we 
show how to extend the basic system by using Haskell's pattern 
matching semantics to assign the pattern match scrutinees Wnf 
types, thereby increasing the expressiveness of the verifier. Third, 
we show how to further improve the precision and usability of the 
system by using a termination checker to assign various terms Fin 
types. Fourth, we close the loop, by illustrating how the termination 
checker can itself be realized using refinement types. Finally, we 
use the termination checker to ensure that all refinements are well- 
defined (i.e. do converge.) 

Example: VCs and Partial Correctness The first example illus- 
trates how, unlike Curry-Howard based systems, refinement types 
do not require termination. That is, we retain the Floyd-Hoare 
notion of "partial correctness", and can verify programs where 
all terms have Div-types. Consider ex1 which uses the result of
collatz as a divisor.

ex1   :: Int -> Int
ex1 n = let x = collatz n in 10 `div` x

collatz :: Int -> {v:Int | v = 1}
collatz n
  | n == 1    = 1
  | even n    = collatz (n / 2)
  | otherwise = collatz (3*n + 1)

The jury is still out on whether the collatz function terminates,
but it is easy to verify that its output is a Div Int equal to 1. At
the call to div the parameter x has the output type of collatz,
yielding the subtyping query:

\[ x:\{v:\mathtt{Int} \mid v = 1\} \vdash \{v = 1\} \preceq \{v > 0\} \]

where the sub-type is just the type of x. As Int is a Div type, the
above reduces to the VC $(\mathtt{true} \Rightarrow v = 1 \Rightarrow v > 0)$ which the
SMT solver proves valid, thereby verifying ex1.

Example: Improving Precision By Forcing Evaluation If all
binders in the environment have Div-types then, effectively, the
verifier can make no assumptions about the context in which a term
evaluates, which leads to a drastic loss of precision. Consider:

ex2 = let {x = 1; y = inc x} in 10 `div` y

inc :: z:Int -> {v:Int | v > z}
inc = \z -> z + 1

The call to div in ex2 is obviously safe, but the system would
reject it, as the call yields the subtyping query:

\[ x:\{x:\mathtt{Int} \mid x = 1\},\ y:\{y:\mathtt{Int} \mid y > x\} \vdash \{v > x\} \preceq \{v > 0\} \]

which, as x is a Div type, reduces to the invalid VC

\[ \mathtt{true} \Rightarrow v > x \Rightarrow v > 0 \]

We could solve the problem by forcing evaluation of x. In Haskell 
the seq operator or a bang-pattern can be used to force evaluation. 
In our system the same effect is achieved by the case-of primitive: 
inside each case the matched binder is guaranteed to be a Haskell
value in WHNF. This intuition is formalized by the typing rule (T- 
CASE-D), which checks each case after assuming the scrutinee and 
the match binder have Wnf types. 

If we force x's evaluation, using the case primitive, the call to 
div yields the subtyping query: 

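\[ x:\{x:\mathtt{Int}^{\downarrow} \mid x = 1\},\ y:\{y:\mathtt{Int} \mid y > x\} \vdash \{v > x\} \preceq \{v > 0\} \tag{5} \]

As x is Wnf, we accept ex2 by proving the validity of the VC

\[ x = 1 \Rightarrow v > x \Rightarrow v > 0 \tag{6} \]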
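At the source level, forcing x could look like the following sketch. It uses seq, which the text names as one way to force evaluation, rather than the case primitive of the core calculus; refinement annotations are omitted, and ex2Seq is a hypothetical name for the rewritten program.

```haskell
-- Plain-Haskell counterparts of inc and ex2 (refinements omitted).
inc :: Int -> Int
inc z = z + 1

-- Original form: x is a lazily bound thunk when y is defined.
ex2 :: Int
ex2 = let { x = 1; y = inc x } in 10 `div` y

-- Rewritten form: seq evaluates x to weak head normal form before the body
-- runs, so a checker may assume the refinement of x inside the body.
ex2Seq :: Int
ex2Seq = let x = 1 in x `seq` (let y = inc x in 10 `div` y)
```

Both definitions compute the same result; the rewrite changes only what a verifier is entitled to assume about x, which is why it is sound but costs idiomatic style.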

Example: Improving Precision By Termination While forcing 
evaluation allows us to ensure that certain environment binders 
have non-Div types, it requires program rewriting using case- 
splitting or the seq operator which leads to non-idiomatic code. 

Instead, our next key optimization is based on the observation 
that in practice, most terms don't diverge. Thus, we can use a 
termination analysis to aggressively assign terminating expressions 
Fin types, which lets us strengthen the environment assumptions 
needed to prove the VCs. For example, in the ex2 example the 
term 1 obviously terminates. Hence, we type x as $\mathtt{Int}^{\Downarrow}$, yielding
the subtyping query for div application:

\[ x:\{x:\mathtt{Int}^{\Downarrow} \mid x = 1\},\ y:\{y:\mathtt{Int} \mid y > x\} \vdash \{v > x\} \preceq \{v > 0\} \tag{7} \]

As x is Fin, we accept ex2 by proving the validity of the VC

\[ x = 1 \Rightarrow v > x \Rightarrow v > 0 \tag{8} \]

Example: Verifying Termination With Refinements While it is 
straightforward to conclude that the term 1 does not diverge, how 
do we do so in general? For example: 



let {x = f 9; y = inc x} in 10 `div` y

f   :: Nat -> {v:Int | v = 1}
f n = if n == 0 then 1 else f (n-1)

We check the call to div via subtyping query (7) and VC (8), which
requires us to prove that f terminates on all $\mathtt{Nat}^{\Downarrow}$ inputs.

We solve this problem by showing how refinement types may 
themselves be used to prove termination, by following the classical 
recipe of proving termination via decreasing metrics [32] as embodied
in sized types [17, 36]. The key idea is to show that each
recursive call is made with arguments of a strictly smaller size, 
where the size is itself a well founded metric, e.g. a natural number. 

We formalize this intuition by type checking recursive proce- 
dures in a termination-weakened environment where the procedure 
itself may only be called with arguments that are strictly smaller 
than the current parameter (using terminating fixpoints of §4.2).
For example, to prove f terminates, we check its body in an environment

\[ n : \mathtt{Nat}^{\Downarrow},\ f : \{n':\mathtt{Nat}^{\Downarrow} \mid n' < n\} \rightarrow \{v = 1\} \]

where we have weakened the type of f to stipulate that it only be
(recursively) called with Nat values n' that are strictly less than
the (current) parameter n. The argument of f exactly captures these
constraints, as using the Abbreviations of Figure 1 the argument of
f is expanded to $\{n':\mathtt{Int}^{\Downarrow} \mid n' < n \wedge n' \geq 0\}$. The body type-checks
as the recursive call generates the valid VC

\[ 0 \leq n \wedge \neg(0 = n) \Rightarrow v = n - 1 \Rightarrow (0 \leq v < n) \]
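A dynamic analogue may help fix the idea. In the sketch below, fixNat is a hypothetical combinator (not the terminating fixpoint of the formal development) that hands the body a recursive reference accepting only naturals strictly below the current parameter; the refinement type system establishes the same condition statically, whereas this version checks it at run time.

```haskell
-- The recursive reference passed to the body only accepts arguments n'
-- with 0 <= n' < n, mirroring the termination-weakened type of f.
fixNat :: ((Int -> a) -> Int -> a) -> Int -> a
fixNat body n = body smaller n
  where
    smaller n'
      | 0 <= n' && n' < n = fixNat body n'
      | otherwise         = error "recursive call is not on a smaller Nat"

-- f from the text, written against the weakened recursive reference.
f :: Int -> Int
f = fixNat (\self n -> if n == 0 then 1 else self (n - 1))

-- The VC generated by the recursive call, as a Boolean function:
--   0 <= n /\ not (0 = n) => v = n - 1 => 0 <= v < n
recursiveCallVC :: Int -> Int -> Bool
recursiveCallVC n v =
  not (0 <= n && n /= 0) || not (v == n - 1) || (0 <= v && v < n)
```

Because every recursive call strictly decreases a natural number, f reaches the base case on any Nat input; recursiveCallVC can be evaluated on sample values of n and v to see that the antecedents force the argument n - 1 into the permitted range.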



Example: Diverging Refinements In this final example we discuss
why refinements should always converge and how we statically
ensure convergence. Consider the invalid specification

diverge 0 :: {v:Int | v = 12}

that states that the value of a diverging integer is 12. The above
specification should be rejected, as the refinement v = 12 does not
evaluate to true ($\mathtt{diverge}\ 0 = 12 \not\hookrightarrow^{*} \mathtt{true}$); instead it diverges.

We want to check the validity of the formula v = 12 under a
model that maps v to the diverging integer diverge 0. Any system
that decides this formula to be true will be unsound, i.e. the VCs
will not soundly approximate subtyping. For similar reasons, the
system should not decide that this formula is false. To reason about
diverging refinements one needs three-valued logic, where logical
formulas can be solved to true, false, or diverging. Since we want to
discharge VCs using SMT solvers that currently do not support
three-valued reasoning, we exclude diverging refinements from types. To
do so, we restrict = to finite integers

\[ (=) :: \mathtt{Int}^{\Downarrow} \rightarrow \mathtt{Int}^{\Downarrow} \rightarrow \mathtt{Bool}^{\Downarrow} \]

and we say that $\{v:B \mid r\}$ is well-formed iff r has a $\mathtt{Bool}^{\Downarrow}$ type
(Corollary 1). Thus the initial invalid specification will be rejected
as non well-formed.

\[
\begin{array}{llcl}
\textbf{Definition} & def & ::= & \mathtt{measure}\ f :: \tau \\
\textbf{Equation} & eq & ::= & f\,(D\ \overline{x}) = r \\
\textbf{Equation to Type} & (\!|f\,(D\ \overline{x}) = r|\!) & \doteq & D :: \overline{x:\tau} \rightarrow \{v:\tau \mid f\ v = r\}
\end{array}
\]

Figure 2. Syntax of Measures

2.6 Measures: From Integers to Data Types

So far, all our examples have used only integer and boolean expressions
in refinements. To describe properties of algebraic data types,
we use measures, introduced in prior work on Liquid Types [20].
Measures are inductively defined functions that can be used in refinements,
and provide an efficient way to axiomatize properties of
data types. For example, emp determines whether a list is empty:

measure emp :: [Int] -> Bool
emp []      = true
emp (x:xs)  = false

The syntax for measures deliberately looks like Haskell, but it is
far more restricted, and should really be considered as a separate
language. A measure has exactly one argument, and is defined by
a list of equations, each of which has a simple pattern on the left
hand side (see Figure 2). The right-hand side of the equation is a
refinement expression r. Measure definitions are typechecked in
the usual way; we omit the typing rules which are standard. (Our
metatheory does not support type polymorphism, so in this paper
we simply reason about lists of integers; however, our implementation
supports polymorphism.)

Denotational semantics The denotational semantics of types in
$\lambda_H$ in §2.3 is readily extended to support measures. In $\lambda_H$ a
refinement r is an arbitrary expression, and calls to a measure are
evaluated in the usual way by pattern matching. For example, with
the above definition of emp it is straightforward to show that

\[ [1,2,3] :: \{v:[\mathtt{Int}] \mid \mathtt{not}\ (\mathtt{emp}\ v)\} \tag{9} \]

as the refinement not (emp ([1, 2, 3])) evaluates to true.

Measures as Axioms How can we reason about invocations of 
measures in the decidable logic of VCs? A natural approach is 
to treat a measure like emp as an uninterpreted function, and add
logical axioms that capture its behaviour. This looks easy: each
equation of the measure definition corresponds to an axiom, thus:

$$\mathsf{emp}\ [\,] = \mathsf{true}$$

$$\forall x, xs.\ \mathsf{emp}\ (x : xs) = \mathsf{false}$$

Under these axioms the judgement (9) is indeed valid.

Measures as Refinements in Types of Data Constructors Axiom- 
atizing measures is precise; that is, the axioms exactly capture the 
meaning of measures. Alas, axioms render SMT solvers inefficient, 
and render the VC mechanism unpredictable, as one must rely on 
various brittle syntactic matching and instantiation heuristics [12]. 

Instead, we use a different approach that is both precise and 
efficient. The key idea is this: instead of translating each measure 
equation into an axiom, we translate each equation into a refined 
type for the corresponding data constructor [20]. This translation
is given in Figure 2. For example, the definition of the measure emp
yields the following refined types for the list data constructors: 

[] :: {v:[Int] | emp v = true}
:  :: x:Int -> xs:[Int] -> {v:[Int] | emp v = false}



These types ensure that: (1) each time a list value is constructed,
its type carries the appropriate emptiness information, so our
system is able to statically decide that (9) is valid; and (2) each
time a list value is matched, the appropriate emptiness information
is used to improve precision of pattern matching, as we see next.

Using Measures As an example, we use the measure emp to pro- 
vide an appropriate type for the head function: 



head :: {v:[Int] | not (emp v)} -> Int
head xs = case xs of
            (x:_) -> x
            []    -> error "yikes"

error :: {v:String | false} -> a
error = undefined



head is safe as its input type stipulates that it will only be called
with lists that are not [], and so error "..." is dead code. The
call to error generates the subtyping query

$$xs:\{xs:[\mathsf{Int}]^{\downarrow} \mid \lnot(\mathsf{emp}\ xs)\},\quad b:\{b:[\mathsf{Int}]^{\downarrow} \mid (\mathsf{emp}\ xs) = \mathsf{true}\} \;\vdash\; \{\mathsf{true}\} \preceq \{\mathsf{false}\}$$

The match-binder b holds the result of the match [30]. In the
[] case, we assign it the refinement of the type of [], which
is (emp xs) = true. Since the call is made inside a case-of
expression, both xs and b are guaranteed to be in WHNF, and thus
they have Wnf types.

The verifier accepts the program as the above subtyping reduces
to the valid VC

$$\lnot(\mathsf{emp}\ xs) \land ((\mathsf{emp}\ xs) = \mathsf{true}) \Rightarrow \mathsf{true} \Rightarrow \mathsf{false}$$

Consequently, our system can naturally support idiomatic Haskell,
e.g. taking the head of an infinite list:

ex x = head (repeat x)

repeat :: Int -> {v:[Int] | not (emp v)}
repeat y = y : repeat y
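
The example can be run under plain GHC to see that only the outermost constructor of the infinite list is forced. In the sketch below the specifications are written in `{-@ ... @-}` comments modeled on LiquidHaskell's comment-based annotations; GHC ignores them, and they have not been run through the checker here. The primed and renamed identifiers (`safeHead`, `repeat'`) are assumptions chosen to avoid clashing with the Prelude.

```haskell
{-@ measure emp @-}
emp :: [Int] -> Bool
emp []      = True
emp (_ : _) = False

{-@ safeHead :: {v:[Int] | not (emp v)} -> Int @-}
safeHead :: [Int] -> Int
safeHead (x : _) = x
safeHead []      = error "yikes"  -- unreachable if the refined input type is respected

-- Non-terminating producer: every result has a cons cell at its head.
{-@ lazy repeat' @-}
{-@ repeat' :: Int -> {v:[Int] | not (emp v)} @-}
repeat' :: Int -> [Int]
repeat' y = y : repeat' y

-- Terminates under lazy evaluation although `repeat' x` is infinite.
ex :: Int -> Int
ex x = safeHead (repeat' x)
```

For instance, `ex 7` returns after forcing a single cons cell of `repeat' 7`.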



Multiple Measures If a type has multiple measures, we simply 
refine each data constructor's type with the conjunction of the 
refinements from each measure. For example, consider a measure 
that computes the length of a list: 



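measure len :: [Int] -> Int
len []     = 0
len (x:xs) = 1 + len xs

$$\begin{array}{llcl}
\textbf{Constants} & c & ::= & 0, 1, -1, \dots \mid \mathsf{true}, \mathsf{false} \\
 & & \mid & +, -, \dots \mid\ =, <, \dots \mid \mathsf{crash} \\
\textbf{Values} & w & ::= & c \mid \lambda x.e \mid D\ \overline{e} \\
\textbf{Expressions} & e & ::= & w \mid x \mid e\ e \mid \mathsf{let}\ x = e\ \mathsf{in}\ e \\
 & & \mid & \mathsf{case}\ x = e\ \mathsf{of}\ \{D\ \overline{x} \to e\} \\
\textbf{Refinements} & r & ::= & e \\
\textbf{Basic Types} & B & ::= & \mathsf{Int} \mid \mathsf{Bool} \mid T \\
\textbf{Types} & \tau & ::= & \{v:B \mid r\} \mid x:\tau \to \tau \\
\textbf{Contexts} & C & ::= & \bullet \mid C\ e \mid c\ C \mid D\ \overline{e}\ C\ \overline{e} \\
 & & \mid & \mathsf{case}\ x = C\ \mathsf{of}\ \{D\ \overline{y} \to e\}
\end{array}$$

Reduction $e \hookrightarrow e'$

$$\begin{array}{rcll}
C[e] & \hookrightarrow & C[e'] & \text{if } e \hookrightarrow e' \\
c\ v & \hookrightarrow & \delta(c, v) & \\
(\lambda x.e)\ e_x & \hookrightarrow & e[e_x/x] & \\
\mathsf{let}\ x = e_x\ \mathsf{in}\ e & \hookrightarrow & e[e_x/x] & \\
\mathsf{case}\ x = D_j\ \overline{e}\ \mathsf{of}\ \{D_i\ \overline{y_i} \to e_i\} & \hookrightarrow & e_j[D_j\ \overline{e}/x][\overline{e}/\overline{y_j}] &
\end{array}$$

Figure 3. $\lambda^U$: Syntax and Operational Semantics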



Using the translation of Figure 2, we extract the following types for
the list data constructors:

[] :: {v:[Int] | len v = 0}
:  :: x:Int -> xs:[Int] -> {v:[Int] | len v = 1 + (len xs)}

The final types for list data constructors will be the conjunction of
the refinements from len and emp:

[] :: {v:[Int] | emp v = true ∧ len v = 0}
:  :: x:Int -> xs:[Int] ->
        {v:[Int] | emp v = false ∧ len v = 1 + (len xs)}
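
As a sanity check on the conjoined constructor types, both measures can be written as plain Haskell functions and the two refinements evaluated on constructed lists. This is a minimal sketch; `nilRefinement` and `consRefinement` are hypothetical names for the refinements attached to the results of `[]` and `:`.

```haskell
emp :: [Int] -> Bool
emp []      = True
emp (_ : _) = False

len :: [Int] -> Int
len []       = 0
len (_ : xs) = 1 + len xs

-- Refinement on the result of []:  emp v = true  /\  len v = 0
nilRefinement :: [Int] -> Bool
nilRefinement v = emp v && len v == 0

-- Refinement on the result of (x : xs):
--   emp v = false  /\  len v = 1 + len xs
consRefinement :: [Int] -> [Int] -> Bool
consRefinement xs v = not (emp v) && len v == 1 + len xs
```

For any finite `xs`, `consRefinement xs (x : xs)` holds by construction, which is the runtime counterpart of the refined type of `:`.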

3. Declarative Typing: $\lambda^U$

Next, we formalize our stratified refinement type system, in two
steps. First, in this section, we present a core calculus $\lambda^U$, with a
general β-reduction semantics. We describe the syntax, operational
semantics, and sound but undecidable declarative typing rules for
$\lambda^U$. Second, in §4, we describe QF-EUFLIA, a subset of $\lambda^U$ that
forms a decidable logic of refinements, and use it to obtain $\lambda^D$ with
decidable SMT-based algorithmic typing.

3.1 Syntax 

Figure 3 summarizes the syntax of $\lambda^U$, which is essentially the
calculus $\lambda^H$ [21] without the dynamic checking features (like casts),
but with the addition of data constructors. In $\lambda^U$, as in $\lambda^H$,
refinement expressions r are not drawn from a decidable logical
sublanguage, but can be arbitrary expressions e (hence r ::= e in
Figure 3). This choice allows us to prove preservation and progress,
but renders typechecking undecidable.

Constants The primitive constants of $\lambda^U$ include true, false, 0,
1, -1, etc., and arithmetic and logical operators like +, -, <, /, ∧,
¬. In addition, we include a special untypable constant crash that
models "going wrong". Primitive operations return a crash when
invoked with inputs outside their domain, e.g. when / is invoked
with 0 as the divisor, or when assert is applied to false.

Data Constructors We encode data constructors as special con- 
stants. Each data type has an arity Arity(T) that represents the ex- 
act number of data constructors that return a value of type T. For
example the data type [Int], which represents lists of integers, has
two data constructors: [] and :, i.e. has arity 2.

Values & Expressions The values of $\lambda^U$ include constants,
λ-abstractions λx.e, and fully applied data constructors D that wrap
expressions. The expressions of $\lambda^U$ include values, as well as
variables x, applications e e, and the case and let expressions.

3.2 Operational Semantics 

Figure 3 summarizes the small-step contextual β-reduction semantics
for $\lambda^U$. Note that we allow for reductions under data constructors,
and thus, values may be further reduced. We write $e \hookrightarrow^{j} e'$ if
there exist $e_1, \dots, e_j$ such that $e$ is $e_1$, $e'$ is $e_j$ and
$\forall i, j,\ 1 \le i < j$, we have $e_i \hookrightarrow e_{i+1}$. We write
$e \hookrightarrow^{*} e'$ if there exists some (finite) $j$ such that
$e \hookrightarrow^{j} e'$.

Constants Application of a constant requires the argument be reduced
to a value; in a single step the expression is reduced to the
output of the primitive constant operation. For example, consider
=, the primitive equality operator on integers. We have
$\delta(=, n) = {=_n}$ where $\delta(=_n, m)$ equals true iff $m$ is the same as $n$.
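
The reduction rules can be prototyped as a small-step evaluator. The sketch below covers only a fragment of the calculus (integers, +, λ-abstraction and application; no data constructors, let or case), and all constructor and function names are assumptions made for illustration. Substitution is deliberately naive, which is adequate only because closed programs are reduced and reduction never goes under a λ.

```haskell
data Expr
  = Lit Int          -- integer constant
  | Add              -- the primitive +
  | AddTo Int        -- partially applied +, i.e. the constant +_n
  | Crash            -- the untypable constant crash
  | Var String
  | Lam String Expr
  | App Expr Expr
  deriving (Eq, Show)

isConst :: Expr -> Bool
isConst e = case e of
  Lit _   -> True
  Add     -> True
  AddTo _ -> True
  Crash   -> True
  _       -> False

isValue :: Expr -> Bool
isValue e = case e of
  Lam _ _ -> True
  _       -> isConst e

-- delta(c, v): the result of applying constant c to value v.
delta :: Expr -> Expr -> Expr
delta Add (Lit n)       = AddTo n
delta (AddTo n) (Lit m) = Lit (n + m)
delta _ _               = Crash  -- argument outside the primitive's domain

-- e[arg/x]; naive, assuming `arg` is closed (see lead-in).
subst :: String -> Expr -> Expr -> Expr
subst x arg e = case e of
  Var y
    | y == x    -> arg
    | otherwise -> e
  Lam y b
    | y == x    -> e
    | otherwise -> Lam y (subst x arg b)
  App f a -> App (subst x arg f) (subst x arg a)
  _       -> e

-- One reduction step, or Nothing if no rule applies.
step :: Expr -> Maybe Expr
step (App (Lam x b) a) = Just (subst x a b)  -- beta: argument is not evaluated
step (App c a)
  | isConst c, isValue a = Just (delta c a)  -- c v  ->  delta(c, v)
  | isConst c            = App c <$> step a  -- context  c C
step (App f a) = (\f' -> App f' a) <$> step f  -- context  C e
step _ = Nothing

-- Take at most `fuel` steps, since the calculus contains diverging terms.
evalN :: Int -> Expr -> Expr
evalN fuel e
  | fuel <= 0 = e
  | otherwise = maybe e (evalN (fuel - 1)) (step e)
```

Because β-reduction substitutes the unevaluated argument, applying a constant function to a diverging term still reaches a value, whereas a primitive such as + forces its argument first.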

3.3 Types 

$\lambda^U$ types include basic types, which are refined with predicates,
and dependent function types. Basic types $B$ comprise integers,
booleans, and a family of data types $T$ (representing lists, trees,
etc.). For example, the data type [Int] represents lists of integers.
We refine basic types with predicates (boolean-valued expressions
$e$) to obtain basic refinement types $\{v:B \mid e\}$. Finally, we have
dependent function types $x:\tau_x \to \tau$ where the input $x$ has the type
$\tau_x$ and the output type $\tau$ may refer to the input binder $x$.

Notation We write $B$ to abbreviate $\{v:B \mid \mathsf{true}\}$, and
$\tau_x \to \tau$ to abbreviate $x:\tau_x \to \tau$ if $x$ does not appear in $\tau$.
We use _ for unused binders. We write $\{v:\mathsf{nat}^{l} \mid r\}$ to
abbreviate $\{v:\mathsf{Int}^{l} \mid 0 \le v \land r\}$.

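Denotations Each type $\tau$ denotes a set of expressions $[\![\tau]\!]$, which
is defined via the dynamic semantics [21]. Let $\lfloor\tau\rfloor$ be the type we
get if we erase all refinements from $\tau$, and let $e : \lfloor\tau\rfloor$ be the
standard typing relation for the typed lambda calculus. Then, we define
the denotation of types as:

$$[\![\{x:B \mid r\}]\!] = \{e \mid e : B,\ \text{if } e \hookrightarrow^{*} w \text{ then } r[w/x] \hookrightarrow^{*} \mathsf{true}\}$$

$$[\![x:\tau_x \to \tau]\!] = \{e \mid e : \lfloor\tau_x \to \tau\rfloor,\ \forall e_x \in [\![\tau_x]\!].\ e\ e_x \in [\![\tau[e_x/x]]\!]\}$$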

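Constants For each constant $c$ we define its type $\mathrm{Ty}(c)$ such that
$c \in [\![\mathrm{Ty}(c)]\!]$. For example,

$$\begin{array}{lcl}
\mathrm{Ty}(3) & = & \{v:\mathsf{Int} \mid v = 3\} \\
\mathrm{Ty}(+) & = & x:\mathsf{Int} \to y:\mathsf{Int} \to \{v:\mathsf{Int} \mid v = x + y\} \\
\mathrm{Ty}(/) & = & \mathsf{Int} \to \{v:\mathsf{Int} \mid v > 0\} \to \mathsf{Int} \\
\mathrm{Ty}(\mathsf{error}_\tau) & = & \{v:\mathsf{Int} \mid \mathsf{false}\} \to \tau
\end{array}$$

So, by definition we get the constant typing lemma

Lemma 1. [Constant Typing] Every constant $c \in [\![\mathrm{Ty}(c)]\!]$.

Thus, if $\mathrm{Ty}(c) = x:\tau_x \to \tau$, then for every value $w \in [\![\tau_x]\!]$,
we require that $\delta(c, w) \in [\![\tau[w/x]]\!]$. For every value
$w \notin [\![\tau_x]\!]$, it suffices to define $\delta(c, w)$ as crash, a special
untyped value.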

Data Constructors The types of data constructor constants are
refined with predicates that track the semantics of the measures
associated with the data type. For example, as discussed in §2.6, we
use emp to refine the list data constructors' types:

$$\mathrm{Ty}([\,]) = \{v:[\mathsf{Int}] \mid \mathsf{emp}\ v\}$$

$$\mathrm{Ty}(:) = \mathsf{Int} \to [\mathsf{Int}] \to \{v:[\mathsf{Int}] \mid \lnot(\mathsf{emp}\ v)\}$$

By construction it is easy to prove that Lemma 1 holds for data
constructors. For example, emp [] evaluates to true.



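Well-Formedness $\Gamma \vdash_U \tau$

$$\dfrac{\Gamma, v:B \vdash_U r : \mathsf{Bool}}{\Gamma \vdash_U \{v:B \mid r\}}\ \text{WF-BASE}
\qquad
\dfrac{\Gamma \vdash_U \tau_x \qquad \Gamma, x:\tau_x \vdash_U \tau}{\Gamma \vdash_U x:\tau_x \to \tau}\ \text{WF-FUN}$$

Subtyping $\Gamma \vdash_U \tau_1 \preceq \tau_2$

$$\dfrac{\forall \theta \in [\![\Gamma]\!].\ [\![\theta(\{v:B \mid r_1\})]\!] \subseteq [\![\theta(\{v:B \mid r_2\})]\!]}{\Gamma \vdash_U \{v:B \mid r_1\} \preceq \{v:B \mid r_2\}}\ \preceq\text{-BASE}$$

$$\dfrac{\Gamma \vdash_U \tau'_x \preceq \tau_x \qquad \Gamma, x:\tau'_x \vdash_U \tau \preceq \tau'}{\Gamma \vdash_U x:\tau_x \to \tau \preceq x:\tau'_x \to \tau'}\ \preceq\text{-FUN}$$

Typing $\Gamma \vdash_U e : \tau$

$$\dfrac{(x, \tau) \in \Gamma}{\Gamma \vdash_U x : \tau}\ \text{T-VAR}
\qquad
\dfrac{}{\Gamma \vdash_U c : \mathrm{Ty}(c)}\ \text{T-CON}$$

$$\dfrac{\Gamma \vdash_U e : \tau' \qquad \Gamma \vdash_U \tau' \preceq \tau \qquad \Gamma \vdash_U \tau}{\Gamma \vdash_U e : \tau}\ \text{T-SUB}$$

$$\dfrac{\Gamma, x:\tau_x \vdash_U e : \tau \qquad \Gamma \vdash_U \tau_x}{\Gamma \vdash_U \lambda x.e : (x:\tau_x \to \tau)}\ \text{T-FUN}$$

$$\dfrac{\Gamma \vdash_U e_1 : (x:\tau_x \to \tau) \qquad \Gamma \vdash_U e_2 : \tau_x}{\Gamma \vdash_U e_1\ e_2 : \tau[e_2/x]}\ \text{T-APP}$$

$$\dfrac{\Gamma \vdash_U e_x : \tau_x \qquad \Gamma, x:\tau_x \vdash_U e : \tau \qquad \Gamma \vdash_U \tau}{\Gamma \vdash_U \mathsf{let}\ x = e_x\ \mathsf{in}\ e : \tau}\ \text{T-LET}$$

$$\dfrac{\begin{array}{c}
\Gamma \vdash_U e : \{v:T \mid r\} \qquad \Gamma \vdash_U \tau \\
\forall i.\ \mathrm{Ty}(D_i) = \overline{y_j:\tau_j} \to \{v:T \mid r_i\} \\
\Gamma, \overline{y_j:\tau_j}, x:\{v:T \mid r \land r_i\} \vdash_U e_i : \tau
\end{array}}{\Gamma \vdash_U \mathsf{case}\ x = e\ \mathsf{of}\ \{D_i\ \overline{y_j} \to e_i\} : \tau}\ \text{T-CASE}$$

Figure 4. Type-checking for $\lambda^U$

3.4 Type Checking

Next, we present the type-checking judgments and rules of $\lambda^U$.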

Environments and Closing Substitutions A type environment $\Gamma$
is a sequence of type bindings $x_1:\tau_1, \dots, x_n:\tau_n$. An environment
denotes a set of closing substitutions $\theta$ which are sequences of
expression bindings $x_1 \mapsto e_1, \dots, x_n \mapsto e_n$ such that:

$$[\![\Gamma]\!] = \{\theta \mid \forall x:\tau \in \Gamma.\ \theta(x) \in [\![\theta(\tau)]\!]\}$$

Judgments We use environments to define three kinds of rules:
Well-formedness, Subtyping, and Typing [4, 21]. A judgment
$\Gamma \vdash_U \tau$ states that the refinement type $\tau$ is well-formed in the
environment $\Gamma$. Intuitively, the type $\tau$ is well-formed if all the
refinements in $\tau$ are Bool-typed in $\Gamma$. A judgment
$\Gamma \vdash_U \tau_1 \preceq \tau_2$ states that the type $\tau_1$ is a subtype of $\tau_2$ in
the environment $\Gamma$. Informally, $\tau_1$ is a subtype of $\tau_2$ if, when the
free variables of $\tau_1$ and $\tau_2$ are bound to expressions described by
$\Gamma$, the denotation of $\tau_1$ is contained in the denotation of $\tau_2$.
Subtyping of basic types reduces to denotational containment checking.
That is, for any closing substitution $\theta$ in the denotation of $\Gamma$, for
every expression $e$, if $e \in [\![\theta(\tau_1)]\!]$ then $e \in [\![\theta(\tau_2)]\!]$.
A judgment $\Gamma \vdash_U e : \tau$ states that the expression $e$ has the type
$\tau$ in the environment $\Gamma$. That is, when the free variables in $e$ are
bound to expressions described by $\Gamma$, the expression $e$ will evaluate
to a value described by $\tau$.

Expressions, Values, Constants, Basic types: see Figure 3

$$\begin{array}{llcl}
\textbf{Types} & \tau & ::= & \{v:B \mid r\} \mid \{v:B^{l} \mid r\} \mid x:\tau \to \tau \\
\textbf{Labels} & l & ::= & \downarrow\ \mid\ \Downarrow \\
\textbf{Refinements} & r & ::= & p \\
\textbf{Predicates} & p & ::= & p = p \mid p < p \mid p \land p \mid \lnot p \\
 & & \mid & n \mid x \mid f\ \overline{p} \mid p \oplus p \\
 & & \mid & \mathsf{true} \mid \mathsf{false} \\
\textbf{Measures} & f, g, h & & \\
\textbf{Operators} & \oplus & ::= & + \mid - \mid \dots \\
\textbf{Integers} & n & ::= & 0 \mid 1 \mid -1 \mid \dots \\
\textbf{Domain} & d & ::= & n \mid c_w \mid D\ \overline{d} \mid \mathsf{true} \mid \mathsf{false} \\
\textbf{Model} & \sigma & ::= & x_1 \mapsto d_1, \dots, x_n \mapsto d_n \\
\textbf{Lifted Values} & w_\bot & ::= & c \mid \lambda x.e \mid D\ \overline{w_\bot} \mid \bot
\end{array}$$

Figure 5. Syntax of $\lambda^D$

Soundness Following $\lambda^H$ [21], we use the (undecidable) $\preceq$-BASE
to show that each step of evaluation preserves typing, and that if an
expression is not a value, then it can be further evaluated:

• Preservation: If $\emptyset \vdash_U e : \tau$ and $e \hookrightarrow e'$, then $\emptyset \vdash_U e' : \tau$.

• Progress: If $\emptyset \vdash_U e : \tau$ and $e \neq w$, then $e \hookrightarrow e'$.

We combine the above to prove that evaluation preserves typing,
and that a well typed term will not crash.

Theorem 1. [Soundness of $\lambda^U$]

• Type-Preservation: If $\emptyset \vdash_U e : \tau$ and $e \hookrightarrow^{*} w$ then $\emptyset \vdash_U w : \tau$.

• Crash-Freedom: If $\emptyset \vdash_U e : \tau$ then $e \not\hookrightarrow^{*} \mathsf{crash}$.

We prove the above following the overall recipe of [21]. Crash-
freedom follows from type-preservation and the fact that crash has no
type. The Substitution Lemma, in particular, follows from a connection
between the typing relation and type denotations:

Lemma 2. [Denotation Typing] If $\emptyset \vdash_U e : \tau$ then $e \in [\![\tau]\!]$.

4. Algorithmic Typing: $\lambda^D$

While $\lambda^U$ is sound, it cannot be implemented because of the undecidable
denotational containment rule $\preceq$-BASE (Figure 4). Next, we
go from $\lambda^U$ to $\lambda^D$, a core calculus with sound, SMT-based
algorithmic type-checking, in four steps. First, we show how to restrict
the language of refinements to an SMT-decidable sub-language
QF-EUFLIA (§4.1). Second, we stratify the types to specify whether
their inhabitants may diverge, must reduce to values, or must
reduce to finite values (§4.2). Third, we show how to enforce the
stratification by encoding recursion using special fixpoint combinator
constants (§4.2). Finally, we show how to use QF-EUFLIA
and the stratification to approximate the undecidable $\preceq$-BASE with
a decidable verification condition $\preceq$-BASE-D, thereby obtaining
the algorithmic system $\lambda^D$ (§4.3).

4.1 Refinement Logic: QF-EUFLIA 

Figure 5 summarizes the syntax of $\lambda^D$. Refinements r are now
predicates p, drawn from QF-EUFLIA, the decidable logic of
equality, uninterpreted functions and linear arithmetic [22].
Predicates p include linear arithmetic constraints, function application
where function symbols correspond to measures (as described in
§2.6), and boolean combinations of sub-predicates.



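All rules as in Figure 4 except as follows

Well-Formedness $\Gamma \vdash_D \tau$

$$\dfrac{\Gamma, v:B \vdash_D p : \mathsf{Bool}^{\Downarrow}}{\Gamma \vdash_D \{v:B^{l} \mid p\}}\ \text{WF-BASE-D}$$

Subtyping $\Gamma \vdash_D \tau_1 \preceq \tau_2$

$$\dfrac{(\!|\Gamma, v:B|\!) \Rightarrow (\!|p_1|\!) \Rightarrow (\!|p_2|\!)\ \text{is valid}}{\Gamma \vdash_D \{v:B \mid p_1\} \preceq \{v:B \mid p_2\}}\ \preceq\text{-BASE-D}$$

Typing $\Gamma \vdash_D e : \tau$

$$\dfrac{\Gamma \vdash_D e_1 : (x:\tau_x \to \tau) \qquad \Gamma \vdash_D y : \tau_x}{\Gamma \vdash_D e_1\ y : \tau[y/x]}\ \text{T-APP-D}$$

$$\dfrac{\begin{array}{c}
l \notin \{\Downarrow, \downarrow\} \Rightarrow \tau \text{ is Div} \\
\Gamma \vdash_D e : \{v:T^{l} \mid r\} \qquad \Gamma \vdash_D \tau \\
\forall i.\ \mathrm{Ty}(D_i) = \overline{y_j:\tau_j} \to \{v:T \mid r_i\} \\
\Gamma, \overline{y_j:\tau_j}, x:\{v:T^{\downarrow} \mid r \land r_i\} \vdash_D e_i : \tau
\end{array}}{\Gamma \vdash_D \mathsf{case}\ x = e\ \mathsf{of}\ \{D_i\ \overline{y_j} \to e_i\} : \tau}\ \text{T-CASE-D}$$

Figure 6. Typechecking for $\lambda^D$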



Well-Formedness For a predicate to be well-formed it should be
boolean; arithmetic operators should be applied to integer terms,
measures should be applied to appropriate arguments (i.e. emp is
applied to [Int]), and equality or inequality to basic (integer or
boolean) terms. Furthermore, we require that refinements, and thus
measures, always evaluate to a value. We capture these requirements
by assigning appropriate types to operators and measure
functions, after which we require that each refinement r has type
$\mathsf{Bool}^{\Downarrow}$ (rule WF-BASE-D in Figure 6).

Assignments Figure 5 defines the elements $d$ of the domain $\mathcal{D}$ of
integers, booleans, and data constructors that wrap elements from
$\mathcal{D}$. The domain $\mathcal{D}$ also contains a constant $c_w$ for each value $w$
of $\lambda^U$ that does not otherwise belong in $\mathcal{D}$ (e.g. functions or other
primitives). An assignment $\sigma$ is a map from variables to $\mathcal{D}$.

Satisfiability & Validity We interpret boolean predicates in the logic
over the domain $\mathcal{D}$. We write $\sigma \models p$ if $\sigma$ is a model of $p$. We omit
the formal definition for space. A predicate $p$ is satisfiable if there
exists $\sigma \models p$. A predicate $p$ is valid if $\sigma \models p$ for all assignments $\sigma$.

Connecting Evaluation and Logic To prove soundness, we need to
formally connect the notion of logical models with the evaluation
of a refinement to true. We do this in several steps, which we only
outline here. First, we introduce a primitive bottom expression $\bot$
that can have any Div type, but does not evaluate. Second, we
define lifted values $w_\bot$ (Figure 5), which are values that contain $\bot$.
Third, we define lifted substitutions $\theta_\bot$, which are mappings from
variables to lifted values. Finally, we show how to embed a lifted
substitution $\theta_\bot$ into a set of assignments $(\!|\theta_\bot|\!)$ where, intuitively
speaking, each $\bot$ is replaced by some arbitrarily chosen element of
$\mathcal{D}$. Now, we can connect evaluation and logical satisfaction:

Theorem 2. If $\emptyset \vdash_D \theta_\bot(p) : \mathsf{Bool}^{\Downarrow}$ then
$\theta_\bot(p) \hookrightarrow^{*} \mathsf{true}$ iff $\forall \sigma \in (\!|\theta_\bot|\!).\ \sigma \models p$.



Restricting Refinements to Predicates Our goal is to restrict
$\preceq$-BASE so that only predicates from the decidable logic
QF-EUFLIA (not arbitrary expressions) appear in implications
$(\!|\Gamma|\!) \Rightarrow \{v:b \mid p_1\} \Rightarrow \{v:b \mid p_2\}$. Towards this goal, as shown in
Figures 5 and 6, we restrict the syntax and well-formedness of types to
contain only predicates, and we convert the program to ANF, after
which we can restrict the application rule T-APP-D to applications
to variables, which ensures that refinements remain within the logic
after substitution [26]. Recall that this is not enough to ensure that
refinements converge, as under lazy evaluation, even binders
can refer to potentially divergent values.

4.2 Stratified Types 

The typing rules for $\lambda^D$ are given in Figure 6. Instead of explicitly
reasoning about divergence or strictness in the refinement logic,
which leads to significant theoretical and practical problems, as
discussed in §8, we choose to reason implicitly about divergence
within the type system. Thus, the second critical step in our path to
$\lambda^D$ is the stratification of types into those inhabited by potentially
diverging terms, terms that only reduce to values, and terms which
reduce to finite values. Furthermore, the stratification crucially
allows us to prove Theorem 2, which requires that refinements
do not diverge (e.g. by computing the length of an infinite list),
by ensuring that inductively defined measures are only applied
to finite values. Next, we describe how we stratify types with
labels, and then type the various constants, in particular the fixpoint
combinators, to enforce stratification.

Labels We specify stratification using two labels for types. The
label $\downarrow$ (resp. $\Downarrow$) is assigned to types given to expressions that
reduce (using β-reduction as defined in Figure 3) to a value $w$ (resp.
finite value, i.e. an element of the inductively defined $\mathcal{D}$). Formally,

$$\text{Wnf types}\quad [\![\{v:B^{\downarrow} \mid r\}]\!] = [\![\{v:B \mid r\}]\!] \cap \{e \mid e \hookrightarrow^{*} w\} \qquad (10)$$

$$\text{Fin types}\quad [\![\{v:B^{\Downarrow} \mid r\}]\!] = [\![\{v:B \mid r\}]\!] \cap \{e \mid e \hookrightarrow^{*} d\} \qquad (11)$$

Unlabelled types are assigned to expressions that may diverge. Note
that for any $B$ and refinement $r$ we have

$$[\![\{v:B^{\Downarrow} \mid r\}]\!] \subseteq [\![\{v:B^{\downarrow} \mid r\}]\!] \subseteq [\![\{v:B \mid r\}]\!]$$

The first two sets are equal for Int and Bool, and unequal for
(lazily) constructed data types T. We need not stratify function
types (i.e. they are Div types) as binders with function types do
not appear inside the VC, and are not applied to measures.

Enforcing Stratification We enforce stratification in two steps.
First, the T-CASE-D rule uses the operational semantics of case-
of to type-check each case in an environment where the scrutinee x
is assumed to have a Wnf type. All the other rules, not mentioned
in Figure 6, remain the same as in Figure 4. Second, we create
stratified variants for the primitive constants and separate fixpoint
combinator constants for (arbitrary, potentially non-terminating)
recursion (fix) and bounded recursion (tfix).

Stratified Primitives First, we restrict the primitive operators whose
output types are refined with logical operators, so they are only
invoked on finite arguments (so that the corresponding refinements
are guaranteed to not diverge).

$$\begin{array}{lcl}
\mathrm{Ty}(n) & = & \{v:\mathsf{Int}^{\Downarrow} \mid v = n\} \\
\mathrm{Ty}(=) & = & x:B^{\Downarrow} \to y:B^{\Downarrow} \to \{v:\mathsf{Bool}^{\Downarrow} \mid v \Leftrightarrow x = y\} \\
\mathrm{Ty}(+) & = & x:\mathsf{Int}^{\Downarrow} \to y:\mathsf{Int}^{\Downarrow} \to \{v:\mathsf{Int}^{\Downarrow} \mid v = x + y\} \\
\mathrm{Ty}(\land) & = & x:\mathsf{Bool}^{\Downarrow} \to y:\mathsf{Bool}^{\Downarrow} \to \{v:\mathsf{Bool}^{\Downarrow} \mid v \Leftrightarrow x \land y\}
\end{array}$$

It is easy to prove that the above primitives respect their
stratification labels, i.e. belong in the denotations of their types.

Note that the above types are restricted in that they can only be
applied to finite arguments. In future work, we could address this
issue with unrefined versions of primitive types that soundly allow
operation on arbitrary arguments. For example, with the current
type for +, addition of potentially diverging expressions is rejected.



Thus, we could define an unrefined signature

$$\mathrm{Ty}(+) = x:\mathsf{Int} \to y:\mathsf{Int} \to \mathsf{Int}$$

and allow the two types of + to co-exist (as an intersection type),
where the type checker would choose the precise refined type if and
only if both of +'s arguments are finite.

Diverging Fixpoints ($\mathsf{fix}_\tau$) Next, note that the only place where
divergence enters the picture is through the fixpoint combinators
used to encode recursion. For any function or basic type
$\tau = \tau_1 \to \dots \to \tau_n$, we define the result to be the type $\tau_n$.

For each $\tau$ whose result is a Div type, there is a diverging
fixpoint combinator $\mathsf{fix}_\tau$, such that

$$\delta(\mathsf{fix}_\tau, f) = f\ (\mathsf{fix}_\tau\ f)$$

$$\mathrm{Ty}(\mathsf{fix}_\tau) = (\tau \to \tau) \to \tau$$

i.e., $\mathsf{fix}_\tau$ yields recursive functions of type $\tau$. Of course, $\mathsf{fix}_\tau$
belongs in the denotation of its type [25] only if the result type is a
Div type (and not when the result is a Wnf or Fin type). Thus, we
restrict diverging fixpoints to functions with Div result types.

Indexed Fixpoints ($\mathsf{tfix}^n_\tau$) For each type $\tau$ whose result is a Fin
type, we have a family of indexed fixpoint combinators $\mathsf{tfix}^n_\tau$:

$$\delta(\mathsf{tfix}^n_\tau, f) = \lambda m.\ f\ m\ (\mathsf{tfix}^m_\tau\ f)$$

$$\mathrm{Ty}(\mathsf{tfix}^n_\tau) = (n:\mathsf{nat}^{\Downarrow} \to \tau_n \to \tau) \to \tau_n
\qquad \text{where } \tau_n = \{v:\mathsf{nat}^{\Downarrow} \mid v < n\} \to \tau$$

$\tau_n$ is a weakened version of $\tau$ that can only be invoked on inputs
smaller than $n$. Thus, we enforce termination by requiring that
$\mathsf{tfix}^n_\tau$ is only called with $m$ that are strictly smaller than $n$. As
the indices are well-founded nats, evaluation will terminate.

Terminating Fixpoints ($\mathsf{tfix}_\tau$) Finally, we use the indexed
combinators to define the terminating fixpoint combinator $\mathsf{tfix}_\tau$ as:

$$\delta(\mathsf{tfix}_\tau, f) = \lambda n.\ f\ n\ (\mathsf{tfix}^n_\tau\ f)$$

$$\mathrm{Ty}(\mathsf{tfix}_\tau) = (n:\mathsf{nat}^{\Downarrow} \to \tau_n \to \tau) \to \mathsf{nat}^{\Downarrow} \to \tau$$

Thus, the top-level call to the recursive function requires a $\mathsf{nat}^{\Downarrow}$
parameter $n$ that acts as a starting index, after which all "recursive"
calls are to combinators with smaller indices, ensuring termination.

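Example: Factorial Consider the factorial function:

$$\mathsf{fac} = \lambda n.\ \lambda f.\ \mathsf{case}\ \_ = (n = 0)\ \mathsf{of}\ \{\ \mathsf{true} \to 1;\ \ \_ \to n \times f(n - 1)\ \}$$

Let $\tau = \mathsf{nat}^{\Downarrow}$. We prove termination by typing

$$\emptyset \vdash_D \mathsf{tfix}_\tau\ \mathsf{fac} : \mathsf{nat}^{\Downarrow} \to \tau$$

To understand why, note that $\mathsf{tfix}^n_\tau$ is only called with arguments
strictly smaller than $n$:

$$\begin{array}{rcl}
\mathsf{tfix}_\tau\ \mathsf{fac}\ n & \hookrightarrow^{*} & \mathsf{fac}\ n\ (\mathsf{tfix}^{n}_\tau\ \mathsf{fac}) \\
 & \hookrightarrow^{*} & n \times (\mathsf{tfix}^{n}_\tau\ \mathsf{fac}\ (n - 1)) \\
 & \hookrightarrow^{*} & n \times (\mathsf{fac}\ (n - 1)\ (\mathsf{tfix}^{n-1}_\tau\ \mathsf{fac})) \\
 & \hookrightarrow^{*} & n \times (n - 1) \times (\mathsf{tfix}^{n-1}_\tau\ \mathsf{fac}\ (n - 2)) \\
 & \hookrightarrow^{*} & n \times (n - 1) \times \dots \times (\mathsf{tfix}^{1}_\tau\ \mathsf{fac}\ 0) \\
 & \hookrightarrow^{*} & n \times (n - 1) \times \dots \times (\mathsf{fac}\ 0\ (\mathsf{tfix}^{0}_\tau\ \mathsf{fac})) \\
 & \hookrightarrow^{*} & n \times (n - 1) \times \dots \times 1
\end{array}$$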
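
The indexed and terminating combinators can be mimicked in plain Haskell. The sketch below follows the two δ-rules for tfix, but replaces the static requirement that the index has type {v:nat | v < n} with a runtime check; that substitution, and the names `tfixN` and `tfix` as Haskell functions, are assumptions of the example rather than part of the formal system.

```haskell
-- Indexed fixpoint:  delta(tfix^n, f) = \m -> f m (tfix^m f).
-- The guard stands in for the refinement {v:nat | v < n} on the argument.
tfixN :: Integer -> (Integer -> (Integer -> r) -> r) -> Integer -> r
tfixN n f m
  | 0 <= m && m < n = f m (tfixN m f)
  | otherwise       = error "tfixN: index is not a smaller natural"

-- Terminating fixpoint:  delta(tfix, f) = \n -> f n (tfix^n f).
tfix :: (Integer -> (Integer -> r) -> r) -> Integer -> r
tfix f n = f n (tfixN n f)

-- The factorial body from the text; the recursive call is the parameter f.
fac :: Integer -> (Integer -> Integer) -> Integer
fac n f = if n == 0 then 1 else n * f (n - 1)
```

Each unfolding passes a strictly smaller index to the next combinator, so a call such as `tfix fac 5` can unfold at most five times before reaching the base case.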

Soundness of Stratification To formally prove that stratification is
soundly enforced, it suffices to prove that the Denotation Lemma 2
holds for $\lambda^D$. This, in turn, boils down to proving that each
(stratified) constant belongs in its type's denotation, i.e. each
$c \in [\![\mathrm{Ty}(c)]\!]$, or that Lemma 1 holds for $\lambda^D$. The crucial part of
the above is proving that the indexed and terminating fixpoints inhabit
their types' denotations.

Theorem 3. [Fixpoint Typing]

• $\mathsf{fix}_\tau \in [\![\mathrm{Ty}(\mathsf{fix}_\tau)]\!]$,

• $\forall n.\ \mathsf{tfix}^n_\tau \in [\![\mathrm{Ty}(\mathsf{tfix}^n_\tau)]\!]$,

• $\mathsf{tfix}_\tau \in [\![\mathrm{Ty}(\mathsf{tfix}_\tau)]\!]$.

With the above we can prove soundness of stratification as a
corollary of the Denotation Lemma 2, given the interpretations of the
stratified types.

Corollary 1. [Soundness of Stratification]

1. If $\emptyset \vdash_D e : \tau^{\Downarrow}$, then evaluation of $e$ is finite.

2. If $\emptyset \vdash_D e : \tau^{\downarrow}$, then $e$ reduces to WHNF.

3. If $\emptyset \vdash_D e : \{v:\tau \mid p\}$, then $p$ cannot diverge.

Finally, as a direct implication of the well-formedness rule
WF-BASE-D we conclude (3), i.e. that refinements cannot diverge.

4.3 Verification With Stratified Types 

We put the pieces together to obtain an algorithmic implication rule
$\preceq$-BASE-D instead of the undecidable $\preceq$-BASE (from Figure 4).
Intuitively, each closing substitution $\theta$ corresponds to a set of
logical assignments $(\!|\theta|\!)$. Thus, we will translate $\Gamma$ into a logical
formula $(\!|\Gamma|\!)$ and denotation inclusion into logical implication such
that:

• $\theta \in [\![\Gamma]\!]$ iff all $\sigma \in (\!|\theta|\!)$ satisfy $(\!|\Gamma|\!)$, and

• $\theta\{v:B \mid p_1\} \subseteq \theta\{v:B \mid p_2\}$ iff all $\sigma \in (\!|\theta|\!)$ satisfy $p_1 \Rightarrow p_2$.

Translating Refinements & Environments To translate environments
into logical formulas, recall that $\theta \in [\![\Gamma]\!]$ iff for each
$x:\tau \in \Gamma$, we have $\theta(x) \in [\![\theta(\tau)]\!]$. Thus,

$$(\!|x_1:\tau_1, \dots, x_n:\tau_n|\!) = (\!|x_1:\tau_1|\!) \land \dots \land (\!|x_n:\tau_n|\!)$$

How should we translate a single binding? Since a binding denotes

$$[\![\{x:B \mid p\}]\!] = \{e \mid \text{if } e \hookrightarrow^{*} w \text{ then } p[w/x] \hookrightarrow^{*} \mathsf{true}\}$$

a direct translation would require a logical value predicate $\mathrm{Val}(x)$,
which we could use to obtain the logical translation

$$(\!|\{x:B \mid p\}|\!) = \lnot \mathrm{Val}(x) \lor p$$

This translation poses several theoretical and practical problems
that preclude the use of existing SMT solvers (as detailed in §8).
However, our stratification guarantees (cf. (10), (11)) that labeled
types reduce to values, and so we can simply conservatively
translate the Div and labeled (Wnf, Fin) bindings as:

$$(\!|\{x:B \mid p\}|\!) = \mathsf{true} \qquad\qquad (\!|\{x:B^{l} \mid p\}|\!) = p$$
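
The conservative translation is simple enough to write down directly. The following sketch uses a deliberately tiny predicate type in place of QF-EUFLIA formulas; `Label`, `Pred`, `Binding`, `embedBinding`, `embedEnv` and `vc` are all names invented for this illustration, and atomic predicates are kept as opaque strings.

```haskell
-- Stratification labels: unlabelled (Div), WHNF (Wnf) and finite (Fin).
data Label = Div | Wnf | Fin
  deriving (Eq, Show)

-- A minimal predicate language standing in for QF-EUFLIA formulas.
data Pred
  = PTrue
  | Atom String      -- opaque atomic predicate, e.g. "0 <= n"
  | And Pred Pred
  | Imp Pred Pred
  deriving (Eq, Show)

-- A binding x:{v:B^l | p}; the binder name is kept only for readability.
data Binding = Binding String Label Pred

-- Div bindings translate to true; labelled bindings translate to p.
embedBinding :: Binding -> Pred
embedBinding (Binding _ Div _) = PTrue
embedBinding (Binding _ _   p) = p

-- An environment translates to the conjunction of its bindings.
embedEnv :: [Binding] -> Pred
embedEnv = foldr (And . embedBinding) PTrue

-- Verification condition for  Gamma |- {v:B | p1} <= {v:B | p2}.
vc :: [Binding] -> Pred -> Pred -> Pred
vc env p1 p2 = Imp (embedEnv env) (Imp p1 p2)
```

Dropping the refinement of a Div binding only weakens the hypotheses of the VC, which is why the translation is conservative: fewer facts are assumed about binders that might not be values.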

Soundness We prove soundness by showing that the decidable
implication $\preceq$-BASE-D approximates the undecidable $\preceq$-BASE.

Theorem 4. If $(\!|\Gamma|\!) \Rightarrow p_1 \Rightarrow p_2$ is valid then

$$\Gamma \vdash_U \{v:B \mid p_1\} \preceq \{v:B \mid p_2\}$$

To prove the above, let VC $= (\!|\Gamma|\!) \Rightarrow p_1 \Rightarrow p_2$. We prove that
if the VC is valid then $\Gamma \vdash_U \{v:B \mid p_1\} \preceq \{v:B \mid p_2\}$. This fact
relies crucially on a notion of tracking evaluation which allows us
to reduce a closing substitution $\theta$ to a lifted substitution $\theta_\bot$,
after which we prove:

Lemma 3. [Lifting] $\theta(e) \hookrightarrow^{*} c$ iff there exists a lifted substitution
$\theta_\bot$, obtained from $\theta$ by tracking evaluation, such that
$\theta_\bot(e) \hookrightarrow^{*} c$.

We combine the Lifting Lemma and the equivalence Theorem 2
to prove that the validity of the VC demonstrates the denotational
containment $\forall \theta \in [\![\Gamma]\!].\ [\![\theta(\{v:B \mid p_1\})]\!] \subseteq [\![\theta(\{v:B \mid p_2\})]\!]$.
The soundness of algorithmic typing follows from Theorems 4 and 1.

Theorem 5. [Soundness of $\lambda^D$]

• Approximation: If $\emptyset \vdash_D e : \tau$ then $\emptyset \vdash_U e : \tau$.

• Crash-Freedom: If $\emptyset \vdash_D e : \tau$ then $e \not\hookrightarrow^{*} \mathsf{crash}$.

5. Implementation: LiquidHaskell

We have implemented $\lambda^D$ in LIQUIDHASKELL. Next, we
describe the key steps in the transition from $\lambda^D$ to Haskell.

5.1 Termination 

Haskell's recursive functions of type $\mathsf{nat}^{\Downarrow} \to \tau$ are represented,
in GHC's Core [30], as let rec f = λn.e, which is operationally
equivalent to let f = $\mathsf{tfix}_\tau$ (λn.λf.e). Given the type of $\mathsf{tfix}_\tau$,
checking that f has type $\mathsf{nat}^{\Downarrow} \to \tau$ reduces to checking e in a
termination-weakened environment where

$$f : \{v:\mathsf{nat}^{\Downarrow} \mid v < n\} \to \tau$$

Thus, LIQUIDHASKELL proves termination just as $\lambda^D$ does: by
checking the body in the above environment, where the recursive
binder is called with nat inputs that are strictly smaller than n.

Default Metric For example, LIQUIDHASKELL proves that

fac n = if n == 0 then 1 else n * fac (n-1)

has type $\mathsf{nat}^{\Downarrow} \to \mathsf{nat}^{\Downarrow}$ by typechecking the body of fac in a
termination-weakened environment
fac : $\{v:\mathsf{nat}^{\Downarrow} \mid v < n\} \to \mathsf{nat}^{\Downarrow}$.
The recursive call generates the subtyping query:

$$n:\{0 \le n\},\ \lnot(n = 0) \vdash_D \{v = n - 1\} \preceq \{0 \le v \land v < n\}$$

which reduces to the valid VC

$$0 \le n \land \lnot(n = 0) \Rightarrow (v = n - 1) \Rightarrow (0 \le v \land v < n)$$

proving that fac terminates, in essence because the first parameter
forms a well-founded decreasing metric.
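
The VC above is small enough to test by brute force. The following sketch encodes it as a Boolean function and checks it on a finite window of integers; this is a sanity test under the stated bound, not a proof, and `implies`, `facVC` and `facVCHoldsOn` are illustrative names.

```haskell
-- Boolean implication.
implies :: Bool -> Bool -> Bool
implies a b = not a || b

-- The VC from the text:
--   0 <= n /\ not (n = 0)  =>  v = n - 1  =>  0 <= v /\ v < n
facVC :: Int -> Int -> Bool
facVC n v =
  (0 <= n && n /= 0) `implies` ((v == n - 1) `implies` (0 <= v && v < n))

-- Check the VC for every n and v in [-bound .. bound].
facVCHoldsOn :: Int -> Bool
facVCHoldsOn bound =
  and [facVC n v | n <- [-bound .. bound], v <- [-bound .. bound]]
```

The hypothesis `not (n = 0)` contributed by the guard is essential: without it, n = 0 and v = -1 would falsify the conclusion `0 <= v`.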

Refinements Enable Termination Consider Euclid's GCD:

gcd :: a:Nat -> {v:Nat | v < a} -> Nat
gcd a 0 = a
gcd a b = gcd b (a `mod` b)

Here, the first parameter is decreasing, but this requires the fact that
the second parameter is smaller than the first and that mod returns
results smaller than its second parameter. Both facts are easily
expressed as refinements, but elude non-extensible checkers [15].

Explicit Termination Metrics The indexed-fixpoint combinator
technique is easily extended to cases where some parameter other
than the first is the well-founded metric. For example, consider:

tfac :: Nat -> n:Nat -> Nat / [n]
tfac x n | n == 0    = x
         | otherwise = tfac (n*x) (n-1)

We specify that the last parameter is decreasing by using an explicit
termination metric / [n] in the type. LIQUIDHASKELL desugars
the termination metric into a new nat-valued ghost parameter d
whose value is always equal to the termination metric n:

tfac :: d:Nat -> Nat -> {n:Nat | d = n} -> Nat
tfac d x n | n == 0    = x
           | otherwise = tfac (n-1) (n*x) (n-1)

Type checking, as before, checks the body in an environment where
the first argument of tfac is weakened, i.e., requires proving
d > n-1. So, the system needs to know that the ghost argument d
represents the decreasing metric. We capture this information in the
type signature of tfac where the last argument exactly specifies
that d is the termination metric n, i.e., d = n. Note that since the
termination metric can depend on any argument, it is important to
refine the last argument, so that all arguments are in scope, with the
fact that d is the termination metric.

To generalize, desugaring of termination metrics proceeds as
follows. Let $f$ be a recursive function with parameters $\overline{x}$ and
termination metric $\mu(\overline{x})$. Then LIQUIDHASKELL will

• add a nat-valued ghost first parameter $d$ in the definition of $f$,

• weaken the last argument of $f$ with the refinement $d = \mu(\overline{x})$,

• at each recursive call of $f\ \overline{e}$, apply $\mu(\overline{e})$ as the first argument.

Explicit Termination Expressions Let us now apply the previous
technique to a function where none of the parameters themselves
decrease across recursive calls, but there is some expression that
forms the decreasing metric. Consider range lo hi, which returns
the list of ints from lo to hi:

range :: lo:Nat -> {hi:Nat | hi >= lo} -> [Nat] / [hi-lo]
range lo hi | lo < hi   = lo : range (lo + 1) hi
            | otherwise = []

Here, neither parameter is decreasing (indeed, the first one is
increasing) but hi-lo decreases across each call. We generalize the
explicit metric specification to expressions like hi-lo.
LIQUIDHASKELL desugars the expression into a new nat-valued ghost
parameter whose value is always equal to hi-lo, that is:

range lo hi = go (hi-lo) lo hi
  where
    go :: d:Nat -> lo:Nat -> {hi:Nat | d = hi - lo} -> [Nat]
    go d lo hi
      | lo < hi   = lo : go (hi-(lo+1)) (lo+1) hi
      | otherwise = []

LIQUIDHASKELL then proves go terminating, by showing that the first
argument d is a nat that decreases across each recursive call.
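
The desugared form runs unchanged as ordinary Haskell once the refinements are erased. The sketch below does that and adds two hypothetical helpers, `ghostTrace` and `strictlyDecreasingNats`, which record the ghost values passed along the recursion and check that they form a strictly decreasing sequence of naturals. It illustrates the metric; it does not reproduce what the checker does internally.

```haskell
-- `range` with the ghost parameter made explicit (refinements erased).
range :: Int -> Int -> [Int]
range lo hi = go (hi - lo) lo hi
  where
    -- _d is the ghost parameter; by construction it equals hi' - lo'.
    go :: Int -> Int -> Int -> [Int]
    go _d lo' hi'
      | lo' < hi' = lo' : go (hi' - (lo' + 1)) (lo' + 1) hi'
      | otherwise = []

-- The ghost values supplied at each call of `go`, outermost call first.
ghostTrace :: Int -> Int -> [Int]
ghostTrace lo hi
  | lo < hi   = (hi - lo) : ghostTrace (lo + 1) hi
  | otherwise = [hi - lo]

-- True when every value is a natural strictly below its predecessor.
strictlyDecreasingNats :: [Int] -> Bool
strictlyDecreasingNats ds =
  all (>= 0) ds && and (zipWith (>) ds (drop 1 ds))
```

For inputs satisfying the precondition `hi >= lo`, the trace starts at `hi - lo` and drops by one per call until it reaches zero.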

Recursion over Data Types The above strategy generalizes easily
to functions that recurse over (finite) data structures like arrays,
lists, and trees. In these cases, we simply use measures to project
the structure onto nat, thereby reducing the verification to the
previously seen cases. For each user-defined type, e.g.

data L [sz] a = N | C a (L a)

we can define a measure

measure sz :: L a -> Nat
sz (C x xs) = 1 + (sz xs)
sz N        = 0

and use it as the decreasing metric to prove that map terminates:

map :: (a -> b) -> xs:L a -> L b / [sz xs]
map f (C x xs) = C (f x) (map f xs)
map f N        = N

Generalized Metrics Over Datatypes Finally, in many functions 
there is no single argument whose (measure) provably decreases. 
For example, consider: 

merge :: xs :_ -> ys:_ -> _ J [sz xs + sz ys] 

merge (C x xs) (C y ys) 

x < y = x "C" (merge xs (y ys)) 

otherwise = y "C" (merge (x "C" xs) ys) 

from the sorting routine of the same name (merge sort). Here, neither
parameter decreases on every call, but the sum of their sizes does.
As before, LIQUIDHASKELL desugars the decreasing expression into a
ghost parameter and thereby proves termination (assuming, of course,
that the inputs were finite lists, i.e. L⇓ a).
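
The sum-of-sizes metric can be observed in a plain-Haskell sketch. The two base cases of merge and the helper mergeTrace are assumptions added here so that the example is complete; the source shows only the clause for two non-empty lists.

```haskell
-- Plain-Haskell sketch of merge with the metric sz xs + sz ys.
-- The base cases and mergeTrace are illustrative additions.

data L a = N | C a (L a)
  deriving (Eq, Show)

sz :: L a -> Int
sz N        = 0
sz (C _ xs) = 1 + sz xs

merge :: Ord a => L a -> L a -> L a
merge (C x xs) (C y ys)
  | x < y     = x `C` merge xs (y `C` ys)
  | otherwise = y `C` merge (x `C` xs) ys
merge N ys = ys   -- assumed base case
merge xs N = xs   -- assumed base case

-- | Value of sz xs + sz ys at each successive call of merge.
mergeTrace :: Ord a => L a -> L a -> [Int]
mergeTrace l@(C x xs) r@(C y ys)
  | x < y     = (sz l + sz r) : mergeTrace xs r
  | otherwise = (sz l + sz r) : mergeTrace l ys
mergeTrace l r = [sz l + sz r]
```

Each recursive call removes exactly one element from one of the two lists, so the recorded sum drops by one per call even though neither argument shrinks on every call.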



Automation: Default Size Measures Structural recursion on the
first argument is a common pattern in Haskell code. LIQUIDHASKELL
automates termination proofs for this common case by allowing users
to specify a size measure for each data type (e.g. sz for L a). Now,
if no termination metric is given, LIQUIDHASKELL assumes by default
that the first argument whose type has an associated size measure
decreases. Thus, in the above, we need not specify metrics for fac
or gcd or map as the size measure is automatically used to prove
termination. This simple heuristic allows us to automatically prove
67% of recursive functions terminating.

5.2 Non-termination 

By default, LIQUIDHASKELL checks that every function is terminating.
We show in §6 that this is in fact the overwhelmingly common case in
practice. However, annotating a function as lazy deactivates
LIQUIDHASKELL's termination check (and marks the result as a Div
type). This allows us to check functions that are non-terminating,
and allows LIQUIDHASKELL to prove safety properties of programs that
manipulate infinite data, such as streams, which arise idiomatically
with Haskell's lazy semantics. For example, consider the classic
repeat function:

repeat x = x `C` repeat x

We cannot use the tfix combinators to represent this kind of
recursion, and hence use the non-terminating fix combinator instead.

Let us see how we can use refinements to statically distinguish
between finite and infinite streams. The direct, global route of using
an inductively defined measure to describe infinite lists is
unavailable, as such a measure, and hence the corresponding
refinement, would be non-terminating. Instead, we describe infinite
lists in a local fashion, by stating that each tail is non-empty.

Step 1: Abstract Refinements We can parameterize a datatype with
abstract refinements that relate sub-parts of the structure [33]. For
example, we parameterize the list type as:

data L a <p :: L a -> Prop>
  = N | C a {v: L <p> a | (p v)}

which parameterizes the list with a refinement p which holds for 
each tail of the list, i.e. holds for each of the second arguments to 
the C constructor in each sub-list. 

Step 2: Measuring Emptiness Now, we can write a measure that 
states when a list is empty 

measure emp :: L a -> Prop
emp N        = true
emp (C x xs) = false

As described in §4, LIQUIDHASKELL translates the abstract refinements
and measures into refined types for N and C.

Step 3: Specification & Verification Finally, we can use the abstract 
refinements and measures to write a type alias describing a refined 
version of L a representing infinite streams: 

type Stream a =
  {xs: L <{\v -> not (emp v)}> a | not (emp xs)}

We can now type repeat as:

lazy repeat :: a -> Stream a
repeat x = x `C` repeat x

The lazy keyword deactivates termination checking, and marks 
the output as a Div type. Even more interestingly, we can prove 
safety properties of infinite lists, for example: 

take :: Nat -> Stream a -> L a
take 0 _        = N
take n (C x xs) = x `C` take (n-1) xs
take _ N        = error "never happens"

LIQUIDHASKELL proves, similar to the head example from §2,
that we never match an N when the input is a stream.
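
The runtime behavior behind this specification can be tried in plain Haskell. The sketch below drops all refinements, so the unreachable N case is not ruled out statically; repeatL, takeL, and toList are renamed or hypothetical helpers, and takeL assumes a non-negative count, mirroring the Nat in the signature above.

```haskell
-- Plain-Haskell sketch of repeat and take on the user-defined list type.
-- Without refinements, nothing here proves that the N case is dead code.

data L a = N | C a (L a)

-- | Infinite stream of copies of x; productive under lazy evaluation.
repeatL :: a -> L a
repeatL x = x `C` repeatL x

-- | First n elements (n is assumed non-negative). On a genuine stream
-- the N case is never reached; plain Haskell's types cannot express
-- that, so the clause calls error instead.
takeL :: Int -> L a -> L a
takeL 0 _        = N
takeL n (C x xs) = x `C` takeL (n - 1) xs
takeL _ N        = error "never happens"

-- | Convert a finite L to a built-in list for inspection.
toList :: L a -> [a]
toList N        = []
toList (C x xs) = x : toList xs
```

Evaluating toList (takeL 3 (repeatL 'a')) terminates even though repeatL 'a' is infinite, because only the first three cells of the stream are ever demanded.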

Finite vs. Infinite Lists Thus, the combination of refinements
and labels allows our stratified type system to specify and verify
whether a list is finite or infinite. Note that: L⇓ a represents
finite lists, i.e. those produced using the (inductive) terminating
fixpoint combinators; L↓ a represents (potentially) infinite lists
which are guaranteed to reduce to values, i.e. non-diverging
computations that yield finite or infinite lists; and L a represents
computations that may diverge or produce a finite or infinite list.

6. Evaluation 

Our goal is to build a practical and effective SMT & refinement
type-based verifier for Haskell. We have shown that lazy evaluation
requires the verifier to reason about divergence; we have proposed an
approach for implicitly reasoning about divergence by eagerly proving
termination, thereby optimizing the precision of the verifier. Next,
we describe an experimental evaluation of our approach that uses
LIQUIDHASKELL to prove termination and functional correctness
properties of a suite of widely used Haskell libraries totaling more
than 10KLOC. Our evaluation seeks to determine whether our approach
is suitable for a lazy language (i.e. do most Haskell functions
terminate?), precise enough to capture the termination reasons (i.e.
is LIQUIDHASKELL able to prove that most functions terminate?), usable
without placing an unreasonably high burden on the user in the form
of explicit termination annotations, and effective enough to enable
the verification of functional correctness properties. For brevity,
we omit a description of the properties other than termination;
please see [34] for details.

Implementation LIQUIDHASKELL takes as input: (1) a Haskell
source file, (2) refinement type specifications, including refined
datatype definitions, measures, predicate and type aliases, and
function signatures, and (3) predicate fragments called qualifiers,
which are used to infer refinement types using the abstract
interpretation framework of Liquid Typing [26]. The verifier returns
as output SAFE or UNSAFE, depending on whether the code meets the
specifications or not, and, importantly for debugging the code (or
specification!), the inferred types for all sub-expressions.

Benchmarks As benchmarks, we used the following libraries:
GHC.List and Data.List, which together implement many standard list
operations, Data.Set.Splay, which implements a splay-tree-based
functional set, Data.Map.Base, which implements a functional map,
Vector-Algorithms, which includes a suite of "imperative" array-based
sorting algorithms, Bytestring, a library for manipulating byte
arrays, and Text, a library for high-performance Unicode text
processing. These benchmarks represent a wide spectrum of idiomatic
Haskell code: the first three are widely used libraries based on
recursive data structures, the fourth and fifth perform subtle,
low-level arithmetic manipulation of array indices and pointers, and
the last is a rich, high-level library with sophisticated
application-specific invariants, well outside the scope of even
Haskell's expressive type system. Thus, this suite provides a
diverse and challenging test-bed for evaluating LIQUIDHASKELL.

Results Table 1 summarizes our experiments, which covered 39
modules totaling 10,209 non-comment lines of source code. The
results were collected on a machine with an Intel Xeon X5600 and
32GB of RAM (no benchmark required more than 1GB). Timing
data was for runs that performed full verification of safety and
functional correctness properties in addition to termination.

• Suitable: Our approach of eagerly proving termination is, in
fact, highly suitable: of the 504 recursive functions, only 12
functions were actually non-terminating (i.e. non-inductive).
That is, 97.6% of recursive functions are inductively defined.

Table 1. A quantitative evaluation of our experiments. LOC is the number of non-
comment lines of source code as reported by sloccount. Fun is the total number of
functions in the library. Rec is the number of recursive functions. Div is the number
of functions marked as potentially non-terminating. Hint is the number of termination
hints, in the form of termination expressions, given to LIQUIDHASKELL. Time is the
time, in seconds, required to run LIQUIDHASKELL.

• Precise: Our approach is extremely precise, as refinements
provide auxiliary invariants and extensibility that are crucial for
proving termination. We successfully prove that 96.0% of
recursive functions terminate.

• Usable: Our approach is highly usable and only places a modest
annotation burden on the user. The default metric, namely the
first parameter with an associated size measure, suffices to
automatically prove 65.7% of recursive functions terminating.
Thus, only 34.3% require an explicit termination metric, totaling
about 1.7 witnesses (about 1 line each) per 100 lines of code.

• Effective: Our approach is extremely effective at improving the
precision of the overall verifier (by allowing the VC to use
facts about binders that provably reduce to values). Without
the termination optimization, i.e. by only using information for
matched-binders (thus in WHNF), LIQUIDHASKELL reports
1,395 unique functional correctness warnings - about 1 per 7
lines. With termination information, this number goes to zero.
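
Two of the ratios quoted in the bullets above can be re-derived from the raw counts stated in the text (504 recursive functions, 12 non-terminating, 10,209 lines, 1,395 warnings). The short sketch below is only a sanity check of that arithmetic; the names percent, inductiveShare, and linesPerWarning are hypothetical.

```haskell
-- Sanity check of ratios quoted in the evaluation; all names are
-- illustrative and the inputs are the counts stated in the text.

-- | part as a percentage of whole.
percent :: Int -> Int -> Double
percent part whole = 100 * fromIntegral part / fromIntegral whole

-- | Share of recursive functions that are inductively defined:
-- (504 - 12) out of 504.
inductiveShare :: Double
inductiveShare = percent (504 - 12) 504

-- | Lines of code per functional correctness warning reported
-- without termination information: 10,209 lines / 1,395 warnings.
linesPerWarning :: Double
linesPerWarning = fromIntegral (10209 :: Int) / fromIntegral (1395 :: Int)
```

The first value is roughly 97.6 and the second roughly 7.3, in line with the "97.6%" and "about 1 per 7 lines" figures above.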

7. Related Work 

Next, we situate our work with respect to closely related lines of research.

Dependent Types are the basis of many verifiers, or more generally,
proof assistants. In this setting arbitrary terms may appear inside
types, so to prevent logical inconsistencies, and enable the checking
of type equivalence, all terms must terminate. "Full" dependently
typed systems like Coq [5], Agda [24], and Idris [7] typically
use structural checks where recursion is allowed on sub-terms of
ADTs to ensure that all terms terminate. We differ in that, since the
refinement logic is restricted, we do not require that all functions
terminate, and hence, we can prove properties of possibly diverging
functions like collatz as well as lazy functions like repeat.
Recent languages like Aura [18] and Zombie [9] allow general
recursion, but constrain the logic to a terminating sublanguage, as
we do, to avoid reasoning about divergence in the logic. In contrast
to us, the above systems crucially assume call-by-value semantics
to ensure that binders are bound to values, i.e. cannot diverge.

Refinement Types are a form of dependent types where invariants
are encoded via a combination of types and predicates from a
restricted SMT-decidable logic [4, 13, 27, 37]. The restriction makes
it safe to support arbitrary recursion, which has hitherto never been
a problem for refinement types. However, we show that this is
because all the above systems implicitly assume that all free
variables are bound to values, which is only guaranteed under CBV
and, as we have seen, leads to unsoundness under lazy evaluation.

Tracking Divergent Computations The notion of type stratification
to track potentially diverging computations dates to at least [10],
which uses τ̄ to encode diverging terms, and types fix as
(τ → τ̄) → τ̄. More recently, [8] tracks diverging computations within
a partiality monad. Unlike the above, we use refinements to obtain
terminating fixpoints (tfix), which let us prove the vast majority
(of sub-expressions) in real world libraries as non-diverging,
avoiding the restructuring that would be required by the partiality
monad.

Termination Analyses Various authors have proposed techniques
to verify termination of recursive functions, either using the
"size-change principle" [19, 28], or by annotating types with size
indices and verifying that the arguments of recursive calls have
smaller indices [2, 17]. Our use of refinements to encode terminating
fixpoints is most closely related to [36], but this work also
crucially assumes CBV semantics for soundness.

AProVE [15] implements a powerful, fully-automatic termination
analysis for Haskell based on term-rewriting. While we could
use an external analysis like AProVE, we have found that encoding
the termination proof via refinements provided advantages that
are crucial in large, real-world code bases. Specifically, refinements
let us (1) prove termination over a subset (not all) of inputs; many
functions (e.g. fac) terminate only on Nat inputs and not all Ints,
(2) encode pre-conditions, post-conditions, and auxiliary invariants
that are essential for proving termination (e.g. gcd), and (3) easily
specify non-standard decreasing metrics and prove termination
(e.g. range). In each case, the code could be (significantly)
rewritten to be amenable to AProVE, but this defeats the purpose of an
automatic checker. Finally, none of the above analyses have been
empirically evaluated on large and complex real-world libraries.

Static Contract Checkers like ESCJava [14] are a classical way
of verifying correctness through assertions and pre- and
post-conditions. Side-effects like modifications of global variables
are a well known issue for static checkers for imperative languages;
the standard approach is to use an effect analysis to determine the
"modifies clause", i.e. the set of globals modified by a procedure.
Similarly, one can view our approach as implicitly computing the
non-termination effects. [38] describes a static contract checker for
Haskell that uses symbolic execution to unroll procedures up to
some fixed depth, yielding weaker "bounded" soundness guarantees.
Similarly, Zeno [29] is an automatic Haskell prover that
combines unrolling with heuristics for rewriting and proof-search.
Based on rewriting, it is sound but "Zeno might loop forever" when
faced with non-termination. Finally, the Halo [35] contract checker
encodes Haskell programs into first-order logic by directly modeling
the code's denotational semantics, again requiring heuristics for
instantiating axioms describing functions' behavior. Halo's
translation of Haskell programs directly encodes constructors as
uninterpreted functions, axiomatized to be injective (as the
denotational semantics requires). This heavyweight encoding is more
precise than predicate abstraction but leads to model-theoretic
problems (outlined in the Halo paper) and, in the absence of
specialized decision procedures, affects the efficiency of the
encoding when scaling to larger programs (see also §8, paragraph B).
Unlike any of the above, our type-based approach does not rely on
heuristics for unrolling recursive procedures or instantiating
axioms. Instead, we are based on decidable SMT validity checking and
abstract interpretation [26], which makes the tool predictable and
the overall workflow scale to the verification of large, real-world
code bases.

8. Conclusions & Future Work 

Our goal is to use the recent advances in SMT solving to build
automated refinement type-based verifiers for Haskell. In this
paper, we have made the following advances towards the goal. First,
we demonstrated how the classical technique for generating VCs
from refinement subtyping queries is unsound under lazy evaluation.
Second, we have presented a solution that addresses the unsoundness
by stratifying types into those that are inhabited by terms
that may diverge, those that must reduce to Haskell values, and
those that must reduce to finite values, and have shown how
refinement types may themselves be used to soundly verify the
stratification. Third, we have developed an implementation of our
technique in LIQUIDHASKELL and have evaluated the tool on a large
corpus comprising 10KLOC of widely used Haskell libraries. Our
experiments empirically demonstrate the practical effectiveness of
our approach: using refinement types, we were able to prove 96% of
recursive functions as terminating, and to crucially use this
information to prove a variety of functional correctness properties.

Limitations While our approach is demonstrably effective in prac- 
tice, it relies critically on proving termination, which, while inde- 
pendently useful, is not wholly satisfying in theory, as adding di- 
vergence shouldn't break a safety proof. Our system can prove a 
program safe, but if the program is modified by making some func- 
tions non-deterministically diverge, then, we may no longer be able 
to prove safety. Thus, in future work, it would be valuable to ex- 
plore other ways to reconcile laziness and refinement typing. We 
outline some routes and the challenging obstacles along them. 

A. Convert Lazy To Eager Evaluation One alternative might be to
translate the program from lazy to eager evaluation, for example, to
replace every (thunk) e with an abstraction λ().e, and every use of
a lazy value x with an application x (). After this, we could simply
assume eager evaluation, and so the usual refinement type systems
could be used to verify Haskell. Alas, no. While sound, this
translation doesn't solve the problem of reasoning about divergence.
A dependent function type x:Int -> {v:Int | v > x} would be
transformed to x:(() -> Int) -> {v:Int | v > x ()}. The transformed
type is problematic as it uses arbitrary function applications in the
refinement logic! The type is only sensible if x () provably reduces
to a value, bringing us back to square one.
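
The thunk translation itself is easy to sketch in plain Haskell. The example below is an illustration under assumptions: Thunk, firstLazy, firstEager, and diverging are hypothetical names, and error stands in for a genuinely diverging term so that the example stays runnable.

```haskell
-- Sketch of the lazy-to-eager translation: every suspended expression
-- e becomes \() -> e, and every use of x becomes x ().
-- All names are illustrative.

-- | A suspended computation, made explicit as a function from unit.
type Thunk a = () -> a

-- | Lazy original: the second argument is never demanded.
firstLazy :: Int -> Int -> Int
firstLazy x _ = x + 1

-- | Translated version: parameters are thunks, uses are applications.
firstEager :: Thunk Int -> Thunk Int -> Int
firstEager x _ = x () + 1

-- | A thunk whose forcing fails; it stands in for a diverging term.
diverging :: Thunk Int
diverging = \() -> error "forced a diverging thunk"
```

Calling firstEager (\() -> 41) diverging returns 42 because the second thunk is never applied. A refinement on the result, such as v > x (), would have to mention the application x (), which is exactly the problem described above.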

B. Explicit Reasoning about Divergence Another alternative is to
enrich the refinement logic with a value predicate Val(x) that is
true when "x is a value" and use the SMT solver to explicitly reason
about divergence. (Note that Val(x) is equivalent to introducing
a ⊥ constant denoting divergence, and writing (x ≠ ⊥).)
Unfortunately, this Val(x) predicate takes the VCs outside the scope
of the standard efficiently decidable logics supported by SMT
solvers. To see why, recall the subtyping query from good in §2. With
explicit value predicates, this subtyping reduces to the VC:

(Val(x) ⇒ x > 0) ∧ (Val(y) ⇒ y > 0) ⇒ (v = y + 1) ⇒ (v > 0)    (12)

To prove the above valid, we require the knowledge that (v = y+1)
implies that y is a value, i.e. that Val(y) holds. This fact, while
obvious to a human reader, is outside the decidable theories of
linear arithmetic of the existing SMT solvers. Thus, existing solvers
would be unable to prove (12) valid, causing us to reject good.

Possible Fix: Explicit Reasoning With Axioms? One possible fix
for the above would be to specify a collection of axioms that
characterize how the value predicate behaves with respect to the
other theory operators. For example, we might specify axioms like:

∀x, y, z. (x = y + z) ⇒ (Val(x) ∧ Val(y) ∧ Val(z))
∀x, y. (x < y) ⇒ (Val(x) ∧ Val(y))

etc. However, this is a non-solution for several reasons. First, it
is not clear what a complete set of axioms is. Second, there is the
well known loss of predictable checking that arises when using
axioms, as one must rely on various brittle, syntactic matching and
instantiation heuristics [12]. It is unclear how well these heuristics
will work with the sophisticated linear programming-based algorithms
used to decide arithmetic theories. Thus, proper support for
value predicates could require significant changes to existing
decision procedures, making it impossible to use existing SMT solvers.

Possible Fix: Explicit Reasoning With Types? Another possible fix
would be to encode the behavior of the value predicates within the
refinement types for different operators, after which the predicate
itself could be treated as an uninterpreted function in the refinement
logic [6]. For instance, we could type the primitives:

(+) :: x:Int -> y:Int
    -> {v | v = x + y && Val x && Val y}

(<) :: x:Int -> y:Int
    -> {v | v <=> x < y && Val x && Val y}

While this approach requires no changes to the SMT machinery, it 
makes specifications complex and verbose. We cannot just add the 
value predicates to the primitives' specifications. Consider 

choose b x y = if b then x+1 else y+2 

To reason about the output of choose we must type it as: 

choose :: Bool -> x:Int -> y:Int
       -> {v | (v > x && Val x) || (v > y && Val y)}

Thus, the value predicates will pervasively clutter all signatures 
with strictness information, making the system unpleasant to use. 

Divergence Requires 3-Valued Logic Finally, for either "fix", the
value predicate poses a model-theoretic problem: what is the meaning
of Val(x)? One sensible approach is to extend the universe with
a family of distinct ⊥ constants, such that Val(⊥) is false. These
constants lead inevitably into a three-valued logic (in order to give
meaning to formulas like ⊥ = ⊥). Thus, even if we were to find
a way to reason with the value predicate via axioms or types, we
would have to ensure that we properly handled the 3-valued logic
within existing 2-valued SMT solvers.
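
To see why a third truth value appears, one can model divergence explicitly. The sketch below is one possible reading, not a construction from this paper: it assumes a Kleene-style treatment in which any comparison involving ⊥ is "unknown", and the names Lifted, K3, eqK, andK, and isVal are all hypothetical.

```haskell
-- Illustrative model of a universe extended with bottom, using a
-- Kleene-style three-valued logic. This choice of logic is an
-- assumption made for the sketch; all names are hypothetical.

-- | A term either diverges (Bot) or is a proper value.
data Lifted a = Bot | Value a
  deriving (Eq, Show)

-- | Three truth values: true, false, and unknown.
data K3 = T | F | U
  deriving (Eq, Show)

-- | The value predicate: false exactly on Bot.
isVal :: Lifted a -> Bool
isVal Bot       = False
isVal (Value _) = True

-- | Equality is only decided when both sides are values.
eqK :: Eq a => Lifted a -> Lifted a -> K3
eqK (Value x) (Value y) = if x == y then T else F
eqK _         _         = U

-- | Kleene conjunction: false dominates, unknown otherwise propagates.
andK :: K3 -> K3 -> K3
andK F _ = F
andK _ F = F
andK T T = T
andK _ _ = U
```

Under this reading, eqK Bot Bot is U rather than T or F, which illustrates the kind of formula that a 2-valued solver would have to be taught to handle.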

Future Work Thus, in future work it would be worthwhile to address
the above technical and usability problems to enable explicit
reasoning with the value predicate. This explicit system would
be more expressive than our stratified approach, e.g. would let
us check let x = collatz 10 in 12 `div` x+1 by encoding
strictness inside the logic. Nevertheless, we suspect such a
verifier would use stratification to eliminate the value predicate
in the common case. At any rate, until these hurdles are crossed, we
can take comfort in stratified refinement types and can just eagerly
use termination to prove safety for lazy languages.